Install
openclaw skills install reflexionClosed-loop learning for AI coding agents. Auto-captures errors and corrections, recalls relevant past solutions when similar situations arise, and promotes recurring patterns to project memory. Use when: errors occur, user corrects the agent, a non-obvious solution is found, or before starting tasks in areas with past learnings.
openclaw skills install reflexionClosed-loop learning for AI coding agents. Inspired by Reflexion: Language Agents with Verbal Reinforcement Learning.
"Reflexion agents verbally reflect on task feedback signals, then maintain their own reflective text in an episodic memory buffer to induce better decision-making in subsequent trials." — Shinn et al., 2023
The problem: AI agents repeat the same mistakes across sessions. They don't learn from errors, don't remember corrections, and every new conversation starts from zero.
The solution: A capture-recall-promote loop that closes the feedback gap.
Error/Correction occurs
|
[CAPTURE] -----> .reflexion/entries/
| |
Next similar task [INDEX] keywords
| |
[RECALL] <--- keyword match on prompt
|
Inject past solution into context
|
[VERIFY] did it work?
| |
Yes No ---> update entry, flag for review
|
occurrences >= 3?
| |
Yes No ---> increment counter
|
[PROMOTE] append rule to CLAUDE.md
| Situation | What Happens |
|---|---|
| Command fails | capture.sh auto-logs error + context to .reflexion/entries/ |
| User corrects agent | Agent calls capture.sh with correction details |
| Similar prompt later | recall.sh finds matching entries, injects solutions into context |
| Pattern seen 3+ times | promote.sh auto-appends a concise rule to CLAUDE.md |
| Want to see stats | Run ./scripts/status.sh for learning dashboard |
# Clone into your project or global skills
git clone https://github.com/user/reflexion.git .claude/skills/reflexion
# Or copy into an existing skills directory
cp -r reflexion/ ~/.claude/skills/reflexion
Add hooks to .claude/settings.json:
{
"hooks": {
"PostToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "./.claude/skills/reflexion/scripts/capture.sh"
}
]
}
],
"UserPromptSubmit": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": "./.claude/skills/reflexion/scripts/recall.sh"
}
]
}
]
}
}
The scripts auto-initialize on first use. No setup needed. To manually initialize:
./scripts/init.sh
The capture.sh hook fires after every Bash tool use. It reads the tool output from stdin (JSON), detects errors via pattern matching, and stores structured entries:
{
"id": "RFX-20260331-a7f",
"type": "error",
"trigger": "npm ERR! Missing script: \"build\"",
"context": "npm run build",
"resolution": "",
"keywords": ["npm", "build", "missing", "script"],
"occurrences": 1,
"first_seen": "2026-03-31",
"last_seen": "2026-03-31",
"promoted": false,
"cwd": "/home/user/project"
}
When the agent (or user) resolves the error, the agent should update the entry:
Update .reflexion/entries/RFX-20260331-a7f.json with resolution:
"Use pnpm run build - this project uses pnpm, not npm"
The recall.sh hook fires before every user prompt. It extracts keywords from the prompt, searches the entry index, and injects relevant past learnings:
<reflexion-recall>
Past learning [RFX-20260331-a7f] (seen 2x):
Trigger: npm ERR! Missing script: "build"
Resolution: Use pnpm run build - this project uses pnpm, not npm
Keywords: npm, build, missing, script
</reflexion-recall>
This costs ~50-80 tokens when matches exist, zero when they don't.
When an entry hits 3+ occurrences, promote.sh appends a concise rule to CLAUDE.md:
<!-- reflexion:auto-promoted -->
## Reflexion: Learned Rules
- This project uses pnpm, not npm. Always use `pnpm run` commands. (seen 3x, source: RFX-20260331-a7f)
Promoted entries are marked "promoted": true and stop being injected via recall (the rule is now in CLAUDE.md permanently).
After the agent applies a recalled solution, it should verify and update:
occurrences, update last_seenThis step is agent-driven (via prompt instruction), not hook-automated, to avoid false positives.
| Type | Trigger | Example |
|---|---|---|
error | Command failure detected by hook | npm ERR!, Permission denied, ModuleNotFoundError |
correction | User says "no", "actually", "wrong" | "Actually use pnpm, not npm" |
insight | Non-obvious solution discovered | "Must run codegen after API changes" |
pattern | Recurring approach that works | "Always check auth status before git push" |
Entries live in .reflexion/entries/ as individual JSON files (one per learning). This enables:
The keyword index at .reflexion/index.txt maps keywords to entry IDs for fast recall:
npm:RFX-20260331-a7f,RFX-20260401-b2c
build:RFX-20260331-a7f
pnpm:RFX-20260331-a7f,RFX-20260401-b2c
docker:RFX-20260402-c1d
An entry is auto-promoted to CLAUDE.md when ALL conditions are met:
occurrences >= 3resolution is non-empty (the fix is known)promoted is false (not already promoted)Promoted rules are written as short, actionable directives. Not incident reports.
When this skill is active, follow these behaviors:
capture.sh already logged it (it runs automatically on Bash errors)resolution fieldLog a correction entry manually:
cat > .reflexion/entries/RFX-$(date +%Y%m%d)-$(head -c3 /dev/urandom | xxd -p | head -c3).json << 'ENTRY'
{
"id": "RFX-...",
"type": "correction",
"trigger": "user said: actually use pnpm",
"context": "attempted npm install",
"resolution": "this project uses pnpm, not npm",
"keywords": ["npm", "pnpm", "install", "package-manager"],
"occurrences": 1,
"first_seen": "2026-03-31",
"last_seen": "2026-03-31",
"promoted": false
}
ENTRY
Then rebuild the index: ./scripts/rebuild-index.sh
When <reflexion-recall> context appears in the prompt:
Run ./scripts/status.sh to see if there are relevant learnings for the area you're about to work in.
capture.sh script redacts common secret patterns (Bearer tokens, API keys, passwords).reflexion/ should be in .gitignore for private projects.reflexion/ creates shared learning (opt-in)| Feature | self-improving-agent | OMC auto-learner | reflexion |
|---|---|---|---|
| Auto-capture errors | Hook reminder only | Pattern detection | Hook + auto-parse + store |
| Structured storage | Markdown append | Content hash dedup | JSON entries + keyword index |
| Cross-session recall | None | None | Auto keyword match + inject |
| Auto-promote to CLAUDE.md | Manual | Manual | Auto at 3 occurrences |
| Token overhead | ~70 tokens always | Variable | 0 tokens when no match, ~60 on match |
| Correction capture | Reminder to log | Confidence scoring | Structured entry with resolution |
| Works offline | Yes | Yes | Yes |
| Dependencies | bash | TypeScript + npm | bash + grep (zero deps) |
reflexion/
├── SKILL.md # This file
├── scripts/
│ ├── init.sh # Initialize .reflexion/ directory
│ ├── capture.sh # PostToolUse hook - auto-capture errors
│ ├── recall.sh # UserPromptSubmit hook - inject past learnings
│ ├── promote.sh # Auto-promote recurring patterns to CLAUDE.md
│ ├── status.sh # Learning stats dashboard
│ └── rebuild-index.sh # Rebuild keyword index from entries
├── assets/
│ └── settings-template.json # Claude Code settings template
└── references/
└── integration.md # Setup guides for different agents
This skill implements the core feedback loop from:
@article{shinn2023reflexion,
title = {Reflexion: Language Agents with Verbal Reinforcement Learning},
author = {Noah Shinn and Federico Cassano and Edward Berman and
Ashwin Gopinath and Karthik Narasimhan and Shunyu Yao},
journal = {arXiv preprint arXiv:2303.11366},
year = {2023},
url = {https://arxiv.org/abs/2303.11366},
doi = {10.48550/arXiv.2303.11366}
}
The paper showed that language agents reflecting on past failures in an episodic memory buffer significantly outperform base agents — achieving 91% pass@1 on HumanEval vs GPT-4's 80%. This skill adapts that principle for AI coding agents: instead of weight updates, it stores verbal reflections (error entries with resolutions) and retrieves them when similar situations arise.