Install
openclaw skills install self-improvement-llm
Autonomous AI memory and self-learning system that logs, extracts lessons, verifies improvements, adapts behavior, manages preferences, and generates reusabl...
A continuous learning loop that automatically captures learnings, tracks improvements, and verifies their effectiveness.
Inspiration: This skill fuses the structured recording format and detection triggers from pskoett/self-improving-agent (6.1k installs) with a verification/hypothesis loop that most agent learning systems lack.
Session / Task
↓
[DETECT] ← Automatic triggers: corrections, errors, feature requests
↓
[LOG] ← Structured entries with IDs, priorities, categories
↓
[EXTRACT] ← Distill patterns from repeated entries
↓
[PROMOTE] ← To AGENTS.md / SOUL.md / TOOLS.md / MEMORY.md
↓
[VERIFY] ← 7-day check: did this change actually help?
↓
[ADAPT] ← Reinforce success, revert failure
↓
(back to detect on next interaction)
The skill also manages the agent's memory system — daily logs, user preferences, and knowledge retention.
┌─────────────────────────────────────────────────────────┐
│ THREE-LAYER MEMORY ARCHITECTURE │
├──────────────┬──────────────────┬───────────────────────┤
│ L1: Session │ L2: Persistent │ L3: User Model │
│ Context │ Store │ Preferences │
│──────────────┼──────────────────┼───────────────────────┤
│ memory/ │ MEMORY.md │ memory/ │
│ sessions/ │ memory/*.md │ preferences.json │
│ (session │ memory/skills/ │ USER.md │
│ summaries) │ (generated │ │
│ │ skills) │ │
└──────────────┴──────────────────┴───────────────────────┘
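L1: Session context
Storage: memory/sessions/YYYY-MM-DD-NNN.md
Contents: a summary of each session (what was done, what was learned, what the user said)
Lifecycle: automatically archived to memory/YYYY-MM-DD.md and retained long term
L2: Persistent store
Storage: MEMORY.md (distilled knowledge) + memory/*.md (raw logs) + memory/skills/ (auto-generated skills)
Contents: results of completed tasks, lessons learned, reusable skill files
Lifecycle: kept permanently; MEMORY.md is re-distilled periodically
L3: User model
Storage: memory/preferences.json + USER.md
Contents: user preferences, communication style, technical background, interests, known pain points
Lifecycle: continuously updated and adjusted as preferences drift
Inspiration: the three-layer memory architecture of the Nous Research Hermes Agent. We replaced SQLite + FTS5 with file storage (lighter weight and a better fit for OpenClaw).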
At the end of each significant task or session, automatically append to memory/YYYY-MM-DD.md:
### ✅ 10:30 - Task description
### ❌ 10:35 - Error: brief description
### 💡 10:40 - Insight: what was learned
### 📌 10:45 - User preference: user said X
Keep entries short (1-2 lines). Don't log every tool call — only significant events.
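As a minimal sketch of the append step, the helper below formats one entry and writes it to the day's log. The function names, the `MARKERS` keys, and the `memory_dir` argument are hypothetical; only the heading shape and the memory/YYYY-MM-DD.md path come from the text above.

```python
from datetime import datetime
from pathlib import Path

# Emoji markers copied from the entry examples above; the dict keys are made up.
MARKERS = {"done": "✅", "error": "❌", "insight": "💡", "preference": "📌"}


def format_entry(kind: str, text: str, when: datetime) -> str:
    """Render one log heading, e.g. '### ✅ 10:30 - Task description'."""
    return f"### {MARKERS[kind]} {when:%H:%M} - {text}"


def append_entry(memory_dir: Path, kind: str, text: str, when: datetime) -> Path:
    """Append the entry to <memory_dir>/YYYY-MM-DD.md, creating it if needed."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    log_path = memory_dir / f"{when:%Y-%m-%d}.md"
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(kind, text, when) + "\n")
    return log_path
```

Opening the file in append mode keeps the log append-only, which matches the "never delete" retention rule for daily logs.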
| Type | Layer | Where | Example |
|---|---|---|---|
| Session summaries | L1 | memory/sessions/*.md | "2026-05-27: searched for the Su Super League (苏超), installed SearXNG" |
| Daily logs | L2 | memory/YYYY-MM-DD.md | "10:30 created the self-improvement skill" |
| Distilled principles | L2 | MEMORY.md | "Simple before powerful" |
| Auto-generated skills | L2 | memory/skills/*.md | "SearXNG deployment procedure" |
| User preferences | L3 | memory/preferences.json | "Answer directly, don't explain" |
| User profile | L3 | USER.md | "Strong technical background, communicates in Chinese" |
| Structured learning | — | .learning-trail.json | All LRN/ERR/FEAT entries |
| Memory | Retention | Action |
|---|---|---|
| Daily logs | Keep forever | Append-only, never delete |
| Learning entries | 90 days | Auto-resolve pending items after 90d |
| Verified principles | Keep forever | Part of long-term knowledge |
| User preferences | Keep until changed | Update when user says otherwise |
| Tool notes | Keep until outdated | Update when tools change |
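When the user asks "之前说过什么" ("what did we say before?") or "帮我回忆一下" ("help me recall"):
- MEMORY.md (distilled knowledge)
- USER.md (preferences)
- grep recent memory/*.md files
- .learning-trail.json for structured entries
During a session
→ A user preference, piece of knowledge, or error is detected
→ It is written to both memory/YYYY-MM-DD.md (raw) and .learning-trail.json (structured)
Session end (at the end of every conversation)
→ An L1 session summary is generated automatically at memory/sessions/YYYY-MM-DD-NNN.md
→ The summary covers: tasks done, what was learned, user feedback, skills generated
→ It is also appended to memory/YYYY-MM-DD.md
Heartbeat / idle
→ Read the patterns in .learning-trail.json
→ Promote those that reach the threshold to MEMORY.md principles or memory/preferences.json preferences
→ Check whether any task merits skill generation (5+ tool calls)
New session start
→ MEMORY.md is injected into the context automatically
→ The watchlist in .learning-trail.json reminds the agent what to watch for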
Automatically log when you notice:
Corrections → log to LEARNINGS.md (category: correction)
Feature Requests → log to FEATURE_REQUESTS.md
Knowledge Gaps → log to LEARNINGS.md (category: knowledge_gap)
Errors → log to ERRORS.md
Successes → log to LEARNINGS.md (category: best_practice)
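The routing above can be written as a small lookup table. This is only an illustrative sketch: the trigger keys and the `route` function are hypothetical names, while the file names and categories are the ones listed above.

```python
from typing import Optional, Tuple

# trigger -> (target file, category); category is None where the text gives none.
ROUTES = {
    "correction": ("LEARNINGS.md", "correction"),
    "feature_request": ("FEATURE_REQUESTS.md", None),
    "knowledge_gap": ("LEARNINGS.md", "knowledge_gap"),
    "error": ("ERRORS.md", None),
    "success": ("LEARNINGS.md", "best_practice"),
}


def route(trigger: str) -> Tuple[str, Optional[str]]:
    """Return where a detected trigger should be logged."""
    if trigger not in ROUTES:
        raise ValueError(f"unknown trigger: {trigger}")
    return ROUTES[trigger]
```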
| Trigger | When | Action |
|---|---|---|
| Session end | After completion | Auto-log summary to memory/YYYY-MM-DD.md + memory/sessions/ L1 summary |
| Skill gen check | After complex task | Auto-generate skill if 5+ tool calls or user says "记住" ("remember") |
| Heartbeat | Idle time | Run learn.py --cycle: check verifications, promote patterns |
| Improve yourself | On demand | Full cycle + report |
| Hook | Session start | If hook installed, review pending learnings |
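After each session or task completes, a session summary is generated automatically at memory/sessions/YYYY-MM-DD-NNN.md:
# Session Summary: 2026-05-27-001
## Tasks Completed
- [Task name] What was done and what the result was
## Learnings
- [What was learned]
## Skills Generated
- [Which skill files were generated]
## User Feedback
- [Important feedback from the user]
## Open Items
- [Unfinished or awaiting confirmation]
When generated: after a complete task flow ends (e.g. after installing SearXNG or finishing a news search)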
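A minimal sketch of rendering the session summary template from plain lists follows. The function name and its parameters are hypothetical; the section titles are the ones in the template above.

```python
from typing import List


def render_session_summary(
    session_id: str,
    tasks: List[str],
    learnings: List[str],
    skills: List[str],
    feedback: List[str],
    open_items: List[str],
) -> str:
    """Build the markdown text for memory/sessions/<session_id>.md."""
    sections = [
        ("Tasks Completed", tasks),
        ("Learnings", learnings),
        ("Skills Generated", skills),
        ("User Feedback", feedback),
        ("Open Items", open_items),
    ]
    lines = [f"# Session Summary: {session_id}"]
    for title, items in sections:
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines) + "\n"
```

Empty sections keep their heading so every summary file has the same shape, which makes later grep-based recall simpler.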
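After completing a task that meets the complexity bar, a standardized skill file is generated automatically.
Generation conditions (any one is sufficient):
Automatic detection:
memory/skills/<task-slug>.md
Every entry uses this format (inspired by the pskoett standard):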
## [LRN-YYYYMMDD-XXX] category:brief_title
**Logged**: ISO-8601 timestamp
**Priority**: low | medium | high | critical
**Status**: pending | in_progress | resolved | wont_fix | promoted
**Area**: frontend | backend | infra | tests | docs | config | behavior | tooling
### Summary
One-line description
### Details
What happened, what was wrong, what's correct
### Suggested Action
Specific fix or improvement
### Metadata
- Source: conversation | error | user_feedback | self_discovery
- Related Files: path/to/file
- Tags: tag1, tag2
- Pattern-Key: unique_key_for_dedup (optional, for recurring patterns)
- Recurrence-Count: 1
- First-Seen: YYYY-MM-DD
- Last-Seen: YYYY-MM-DD
## [ERR-YYYYMMDD-XXX] tool_or_command_name
**Logged**: ISO-8601 timestamp
**Priority**: high
**Status**: pending
**Area**: infra | tooling | config
### Summary
Brief description of what failed
### Error
Actual error message or output
### Context
- Command/operation attempted
- Input or parameters used
### Suggested Fix
What might resolve this
### Metadata
- Reproducible: yes | no | unknown
- Related Files: path/to/file
- See Also: ERR-YYYYMMDD-XXX (if recurring)
## [FEAT-YYYYMMDD-XXX] capability_name
**Logged**: ISO-8601 timestamp
**Priority**: medium
**Status**: pending
**Area**: as appropriate
### Summary
What the user wanted to do
### User Context
Why they needed it
### Complexity Estimate
simple | medium | complex
### Metadata
- Frequency: first_time | recurring
- Related Features: existing_feature_name
Format: TYPE-YYYYMMDD-XXX
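A sketch of generating the next ID in this format is shown below. It assumes XXX is a three-digit, per-type, per-day sequence number (consistent with examples such as LRN-20260505-003, but not stated outright); `next_entry_id` is a hypothetical helper, not part of learn.py.

```python
import re
from datetime import date
from typing import List

ID_PATTERN = re.compile(r"^(LRN|ERR|FEAT)-(\d{8})-(\d{3})$")


def next_entry_id(entry_type: str, day: date, existing_ids: List[str]) -> str:
    """Return TYPE-YYYYMMDD-XXX with the next free sequence number for that day."""
    stamp = day.strftime("%Y%m%d")
    used = []
    for entry_id in existing_ids:
        match = ID_PATTERN.match(entry_id)
        if match and match.group(1) == entry_type and match.group(2) == stamp:
            used.append(int(match.group(3)))
    return f"{entry_type}-{stamp}-{max(used, default=0) + 1:03d}"
```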
Where to log: The agent logs structured entries to memory/.learning-trail.json (structured, queryable). The helper scripts also write human-readable copies to .learnings/ files if they exist.
When logging something that might already exist:
.learning-trail.json for matching Pattern-Key
Promote a pattern to workspace core files when all are true:
Promotion targets:
| Entry Type | Promote To | Example |
|---|---|---|
| Behavioral pattern | SOUL.md | "Be concise, skip disclaimers" |
| Workflow improvement | AGENTS.md | "Spawn sub-agents for long tasks" |
| Tool gotcha | TOOLS.md | "Git push needs auth configured" |
| User preference | USER.md / preferences.json | "User prefers direct answers" |
| Universal principle | MEMORY.md | "Simple before powerful" |
| Reusable procedure | memory/skills/*.md | "SearXNG deployment procedure" |
---
name: skill-slug-name
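description: One-sentence description of what this skill does
created: 2026-05-27
updated: 2026-05-27
source: auto
triggers: ["trigger keywords or scenarios"]
tools: [web_fetch, exec, read]
---
## Procedure
1. Step one: what was done
2. Step two: how it was done
3. Step three: verify the result
## Pitfalls
- Known issues or traps
- Error-prone spots
- Environment dependencies
## Verification
- How to verify the result is correct
- What the expected output is
Skill reuse flow:
Match keywords against the memory/skills/ directory
When a change is promoted or applied, record a verification entry: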
{
"id": "change-20260505-001",
"source": "LRN-20260505-003",
"target": "TOOLS.md",
"change": "Added 'prefer read over exec for files'",
"hypothesis": "This will reduce file-viewing errors",
"verified": false,
"next_check": "2026-05-12",
"evidence": []
}
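A sketch of building such a record and testing whether its check is due, assuming the 7-day window described for the VERIFY step. Both helpers are hypothetical; the field names are taken from the JSON above.

```python
from datetime import date, timedelta


def new_verification(change_id: str, source: str, target: str, change: str,
                     hypothesis: str, applied_on: date,
                     check_after_days: int = 7) -> dict:
    """Create an unverified record whose first check falls 7 days after applying."""
    return {
        "id": change_id,
        "source": source,
        "target": target,
        "change": change,
        "hypothesis": hypothesis,
        "verified": False,
        "next_check": (applied_on + timedelta(days=check_after_days)).isoformat(),
        "evidence": [],
    }


def is_due(record: dict, today: date) -> bool:
    """True when the record is still unverified and its check date has arrived."""
    return not record["verified"] and date.fromisoformat(record["next_check"]) <= today
```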
After 7 days, learn.py --cycle checks:
Verification outcomes:
| Result | Action |
|---|---|
| ✅ Confirmed effective | Mark verified, reduce monitoring to monthly |
| ❌ Ineffective | Revert change, log why it failed |
| ❌ Made worse | Revert immediately, escalate |
| ❓ Inconclusive | Extend monitoring, add more data points |
python3 scripts/learn.py --cycle # Full cycle: check verifications + promote patterns
python3 scripts/learn.py --verify # Only check pending verifications
python3 scripts/learn.py --status # Show learning stats
# Logging with source
python3 scripts/learn.py --log learning "user corrected me on X" --area behavior --source user_feedback --priority high
CLI --log parameters:
| Param | Values | Default |
|---|---|---|
| --source | conversation, error, user_feedback, self_discovery | self_discovery |
| --priority | critical, high, medium, low | medium |
| --area | any string | tooling |
| --pattern-key | any string | none |
For automatic reminders at session start, install the hook:
# Copy hook files (HOOK.md + handler.js) to OpenClaw hooks directory
cp skills/self-improvement/hooks/openclaw/HOOK.md ~/.openclaw/hooks/self-improvement/HOOK.md
cp skills/self-improvement/hooks/openclaw/handler.js ~/.openclaw/hooks/self-improvement/handler.js
# Enable it
openclaw hooks enable self-improvement
# Verify
openclaw hooks list
Important: OpenClaw hooks require HOOK.md + handler.js at the top level of the hook directory. Shell scripts (hook.sh) are not supported.
The hook checks .learning-trail.json on session start for:
| Situation | Action |
|---|---|
| Command/operation fails | Log to ERRORS.md + auto-log |
| User corrects you | Log to LEARNINGS.md (correction) |
| User wants missing feature | Log to FEATURE_REQUESTS.md |
| API/external tool fails | Log to ERRORS.md |
| Knowledge was outdated | Log to LEARNINGS.md (knowledge_gap) |
| Found better approach | Log to LEARNINGS.md (best_practice) |
| Same error 3x across sessions | Promote to core file |
| Change applied 7+ days ago | Run verification check |
| Priority | When to Use |
|---|---|
| critical | Blocks core functionality, data loss risk, security issue |
| high | Significant impact, affects common workflows, recurring issue |
| medium | Moderate impact, workaround exists |
| low | Minor inconvenience, nice-to-have |
When two principles contradict, the system uses priority scoring to decide which wins:
Score = BasePriority(100/60/30/10) + RecurrenceBonus(×10 each) + RecencyBonus(up to 30) + AreaWeight(up to 50)
Highest score wins.
Example conflict:
When a tie is detected, the system logs it for human review.
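The scoring formula can be sketched as follows. Several details are assumptions rather than documented behavior: that the base values 100/60/30/10 map to critical/high/medium/low in that order, that the recency bonus loses one point per day since the entry was last seen, and that the area weight is supplied by the caller and clamped to 0..50. The function names are hypothetical.

```python
from typing import Dict, Optional

# Assumed mapping of the four base values to the four priority levels.
BASE_PRIORITY = {"critical": 100, "high": 60, "medium": 30, "low": 10}


def conflict_score(priority: str, recurrence_count: int,
                   days_since_last_seen: int, area_weight: int = 0) -> int:
    """Score = BasePriority + RecurrenceBonus + RecencyBonus + AreaWeight."""
    recurrence_bonus = 10 * recurrence_count
    recency_bonus = max(0, 30 - days_since_last_seen)  # assumed linear decay
    clamped_area = min(max(area_weight, 0), 50)
    return BASE_PRIORITY[priority] + recurrence_bonus + recency_bonus + clamped_area


def pick_winner(scores: Dict[str, int]) -> Optional[str]:
    """Return the highest-scoring principle, or None on a tie (human review)."""
    best = max(scores.values())
    winners = [name for name, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None
```

Under these assumptions a frequently recurring, recently seen high-priority principle can outrank an old critical one, which appears to be the point of adding the bonuses to the base priority.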
Old learnings that aren't reinforced automatically fade:
| Time without reinforcement | Action |
|---|---|
| 30 days | Priority demoted one level (high→medium, etc.) |
| 60 days | Priority → low, flagged as stale |
| 90 days | Auto-resolved as wont_fix |
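A sketch of the decay table as a pure function. It assumes the thresholds are measured from the last reinforcement and applied to the entry's original priority rather than cumulatively; `apply_decay` and its return shape are hypothetical.

```python
from typing import Tuple

LEVELS = ["low", "medium", "high", "critical"]


def apply_decay(priority: str, status: str, days_idle: int) -> Tuple[str, str, bool]:
    """Return (priority, status, stale) after days_idle days without reinforcement."""
    if days_idle >= 90:
        return "low", "wont_fix", True  # auto-resolved
    if days_idle >= 60:
        return "low", status, True  # flagged as stale
    if days_idle >= 30:
        demoted = LEVELS[max(LEVELS.index(priority) - 1, 0)]
        return demoted, status, False  # one level down, e.g. high -> medium
    return priority, status, False
```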
Reinforcement happens when:
When a verification is overdue by 7+ days without evidence:
| Overdue | Action |
|---|---|
| 7 days | Grace period — reminder only |
| 14 days | First extension + evidence request |
| 21+ days | Auto-revert: change undone, logged as auto_reverted |
The revert is safe because all changes are file-based (TOOLS.md, USER.md, etc.) and the old state is tracked in the learning trail.
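The escalation schedule can be sketched as a function of how many days a check is overdue. The action labels other than auto_reverted are made-up names, and treating any recorded evidence as stopping the escalation is an assumption.

```python
from datetime import date


def overdue_action(next_check: date, today: date, has_evidence: bool) -> str:
    """Map days overdue to the action in the table above."""
    days_overdue = (today - next_check).days
    if has_evidence or days_overdue < 7:
        return "none"
    if days_overdue >= 21:
        return "auto_reverted"
    if days_overdue >= 14:
        return "extend_and_request_evidence"
    return "remind"
```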
When the learning system detects a pattern ready for promotion or a change that needs verification, it generates a proposal for user review:
Pattern detected (≥3x across ≥2 sessions)
↓
Generate proposal: what to change, why, risk level
↓
Present to user for approval
↓
User says "approve N" or "skip N"
↓
Apply approved changes, track for verification
Each proposal includes:
| Change Type | Action | Example |
|---|---|---|
| Add note to TOOLS.md | ✅ Auto-apply | "QWeather needs custom host" |
| Add principle to MEMORY.md | ✅ Auto-apply | "Simple before powerful" |
| Add preference to USER.md | ✅ Auto-apply | "User prefers direct answers" |
| Add guideline to SOUL.md | ⚠️ Propose | "Be concise, skip disclaimers" |
| Add rule to AGENTS.md | ⚠️ Propose | "Spawn sub-agents for long tasks" |
| Create new skill | ❌ Always ask | New skill for recurring task |
python3 scripts/learn.py --propose # Generate proposals for review
The agent will present proposals and wait for your approval before applying.
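A sketch of the proposal gate and the approval table: a pattern qualifies once it has been seen at least 3 times across at least 2 sessions, and the approval mode depends on the target. The function names, the "new_skill" key, and the fallback of asking for any unlisted target are assumptions.

```python
from typing import List

APPROVAL_MODE = {
    "TOOLS.md": "auto_apply",
    "MEMORY.md": "auto_apply",
    "USER.md": "auto_apply",
    "SOUL.md": "propose",
    "AGENTS.md": "propose",
    "new_skill": "always_ask",
}


def ready_for_proposal(session_ids: List[str]) -> bool:
    """session_ids holds one session ID per occurrence of the pattern."""
    return len(session_ids) >= 3 and len(set(session_ids)) >= 2


def approval_mode(target: str) -> str:
    """Unlisted targets fall back to the most cautious mode (an assumption)."""
    return APPROVAL_MODE.get(target, "always_ask")
```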
After each significant interaction, score the response on 5 dimensions (0-10):
| Dimension | What it measures |
|---|---|
| Accuracy | Was the output factually correct? |
| Usefulness | Did it solve the user's actual problem? |
| Efficiency | Were tool calls optimal? |
| Tone | Matched SOUL.md persona? |
| Proactiveness | Anticipated needs? |
python3 scripts/learn.py --score 8 9 7 8 6 # Score last conversation
python3 scripts/learn.py --trends 7 # Show 7-day trend
Scores are stored in .learning-trail.json and displayed as trends:
📈 Score Trends (last 7 days, 12 scores):
Date Avg Acc Use Eff Ton Pro
──────────────────────────────────────────
2026-05-01 7.2 8 8 7 7 6
2026-05-02 7.8 8 9 7 8 7
2026-05-03 8.0 8 9 8 8 7
Trend: ↑ (7.2 → 8.0)
No scores yet = no way to measure improvement. Start scoring after each meaningful interaction.
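The Avg column and the trend line in the sample output can be reproduced with a few lines. This is a sketch, not the code in learn.py; the function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence

DIMENSIONS = ("accuracy", "usefulness", "efficiency", "tone", "proactiveness")


def average_score(scores: Sequence[int]) -> float:
    """Average the five 0-10 dimension scores, rounded to one decimal."""
    if len(scores) != len(DIMENSIONS):
        raise ValueError("expected one score per dimension")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("scores must be between 0 and 10")
    return round(mean(scores), 1)


def trend_line(daily_averages: List[float]) -> str:
    """Compare the first and last daily averages in the window."""
    first, last = daily_averages[0], daily_averages[-1]
    arrow = "↑" if last > first else "↓" if last < first else "→"
    return f"Trend: {arrow} ({first} → {last})"
```

For the three days in the sample, `average_score` gives 7.2, 7.8 and 8.0, and `trend_line([7.2, 7.8, 8.0])` yields the final line shown above.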
Instead of injecting ALL of MEMORY.md into every session, the system builds a topic-indexed memory index and injects only relevant memories.
memory/*.md files, detect topics, create .memory-index.json
| Topic | Keywords |
|---|---|
| weather | 天气, 温度, wind, rain, 预报 |
| code | 代码, script, python, bug, fix |
| finance | 金融, 股票, stock, 交易 |
| skill | skill, clawhub, 技能 |
| learning | improve, learn, reflect, 学习 |
| memory | memory, remember, recall, 记忆 |
| browser | browser, playwright, 自动化 |
| config | config, 配置, setup, API, key |
python3 scripts/learn.py --build-index # Build topic index
python3 scripts/learn.py --query-memory weather # Query weather memories
The index is automatically rebuilt during --cycle. When a new session starts, the agent detects the topic and queries relevant memories instead of loading everything.
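A sketch of keyword-based topic detection using the table above. Case-insensitive substring matching is an assumption; how learn.py actually matches is not documented here.

```python
from typing import List

TOPIC_KEYWORDS = {
    "weather": ["天气", "温度", "wind", "rain", "预报"],
    "code": ["代码", "script", "python", "bug", "fix"],
    "finance": ["金融", "股票", "stock", "交易"],
    "skill": ["skill", "clawhub", "技能"],
    "learning": ["improve", "learn", "reflect", "学习"],
    "memory": ["memory", "remember", "recall", "记忆"],
    "browser": ["browser", "playwright", "自动化"],
    "config": ["config", "配置", "setup", "API", "key"],
}


def detect_topics(text: str) -> List[str]:
    """Return every topic with at least one keyword occurring in the text."""
    lowered = text.lower()
    return [
        topic
        for topic, keywords in TOPIC_KEYWORDS.items()
        if any(keyword.lower() in lowered for keyword in keywords)
    ]
```

Substring matching is simple but over-matches short keywords (for example "key" also matches "keyboard"), so a real index would likely need word-boundary checks for the English terms.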
Connect memories into a network: event → lesson → principle.
| Type | Icon | Description |
|---|---|---|
| event | 📌 | A concrete event ("used exec to read a file") |
| lesson | 💡 | A lesson learned from an event |
| principle | 📜 | A general principle ("Simple before powerful") |
| knowledge | 📖 | Factual knowledge ("QWeather needs a custom host") |
| pattern | 🔍 | A recurring pattern |
| Type | Direction | Meaning |
|---|---|---|
| caused_by | A → B | A was caused by B |
| led_to | A → B | A led to B |
| supports | A → B | A supports B |
| contradicts | A → B | A contradicts B |
| related_to | A → B | A is related to B |
| derived_from | A → B | A was derived from B |
# Create nodes
python3 scripts/learn.py --graph-node event "Used exec to read a file" manual
python3 scripts/learn.py --graph-node lesson "Should use the read tool" manual
python3 scripts/learn.py --graph-node principle "Simple before powerful" manual
# Create edges
python3 scripts/learn.py --graph-edge eve-XXXX-001 les-XXXX-001 caused_by
python3 scripts/learn.py --graph-edge les-XXXX-001 pri-XXXX-001 led_to
# Auto-link (based on content similarity)
python3 scripts/learn.py --graph-auto-link eve-XXXX-001 "Used exec to read a file"
# Query graph
python3 scripts/learn.py --graph-query # Show full graph
python3 scripts/learn.py --graph-query type:lesson # Query by type
python3 scripts/learn.py --graph-query eve-XXXX-001 # Query by node ID
When creating a node, the system automatically links it to existing nodes based on content similarity:
- related_to
- caused_by
- supports
- contradicts
🕸️ Knowledge Graph (4 nodes, 3 edges):
📌 EVENTs (1):
[eve-20260505-001] Used exec for file read instead of read tool
💡 LESSONs (1):
[les-20260505-002] Always use read tool for file viewing, not exec
📜 PRINCIPLEs (1):
[pri-20260505-003] Simple before powerful
📖 KNOWLEDGEs (1):
[kno-20260505-004] QWeather needs custom API host
🔗 Edges:
Always use read tool... ──caused_by──► Used exec for file...
Always use read tool... ──led_to──► Simple before powerful...
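A minimal in-memory sketch of the node and edge model follows. The ID scheme (first three letters of the type, the date, and a graph-wide three-digit counter) is inferred from the sample output and should be treated as an assumption; the `Graph` class is hypothetical and does not reflect how learn.py stores the graph.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple

NODE_TYPES = {"event", "lesson", "principle", "knowledge", "pattern"}
EDGE_TYPES = {"caused_by", "led_to", "supports", "contradicts",
              "related_to", "derived_from"}


@dataclass
class Graph:
    nodes: Dict[str, dict] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node_type: str, text: str, day: date) -> str:
        """Store a node and return its ID, e.g. 'eve-20260505-001'."""
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        node_id = f"{node_type[:3]}-{day:%Y%m%d}-{len(self.nodes) + 1:03d}"
        self.nodes[node_id] = {"type": node_type, "text": text}
        return node_id

    def add_edge(self, source: str, target: str, edge_type: str) -> None:
        """Record a directed edge source --edge_type--> target."""
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both nodes must exist before linking")
        self.edges.append((source, edge_type, target))

    def by_type(self, node_type: str) -> List[str]:
        """Return the IDs of all nodes of one type, in insertion order."""
        return [nid for nid, node in self.nodes.items() if node["type"] == node_type]
```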