Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Skill Garden

v1.0.0

Automatically improves installed skills through passive usage observation and periodic batch analysis. Activates after any skill is used, or when you say gro...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for yjin94606-art/skill-garden.

Prompt Preview — Install & Setup
Install the skill "Skill Garden" (yjin94606-art/skill-garden) from ClawHub.
Skill page: https://clawhub.ai/yjin94606-art/skill-garden
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Canonical install target

openclaw skills install yjin94606-art/skill-garden

ClawHub CLI


npx clawhub@latest install skill-garden
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
Skill Garden's name and description match its implementation: it reads per-skill usage logs, evaluates skills, generates proposals, and edits SKILL.md files. Requiring read/write access to ~/.openclaw/workspace/skills is coherent with the stated purpose. However, the skill claims to be 'user-in-control' and 'transparent' while also auto-applying changes in multiple confidence bands (see instruction_scope).
Instruction Scope
SKILL.md instructs the agent to passively log every skill invocation and to run a weekly batch analysis that can auto-apply edits to other skills' SKILL.md. This gives it the ability to collect broad cross-skill telemetry and to modify other skills without explicit per-change approval (the decision tree auto-applies changes at ≥90% confidence, and at 70–89% with an [experimental] tag). That scope is functionally necessary for an auto-improver, but it is a high-risk capability (silent/automatic edits to other skills, broad read/write of skill directories).
Install Mechanism
There is no network download or external installer; the skill is instruction-only with bundled scripts. That's lower risk than arbitrary remote downloads. Files are intended to be run from the user's workspace and operate on local files only.
Credentials
The skill requests no environment variables or external credentials, which is proportionate to its stated function. However, it implicitly requires filesystem access to ~/.openclaw/workspace/skills (read and write), which is not listed in a formal manifest. That filesystem access is necessary for the purpose but is high-privilege because it touches other skills' files and the grower's own memory/logs.
Persistence & Privilege
always: false (good), but the skill is allowed to invoke autonomously and — by design — can modify other skills' SKILL.md and maintain its own cron schedule and memory. Modifying other skills' code/content is central to its purpose but is also a significant privilege; the SKILL.md contains inconsistent statements about when changes 'ask first' versus auto-apply, increasing the risk of unexpected permanent changes.
Scan Findings in Context
[modifies_other_skills] expected: The skill's core function is to edit other skills' SKILL.md; this is consistent with its purpose but is a high-privilege action and should be explicitly consented to by the user.
[auto_apply_policy_vs_documentation_inconsistency] unexpected: The SKILL.md philosophy claims 'uncertain ones ask first', but the decision tree and earlier sections apply changes automatically at 70–89% (tagged experimental) and auto-apply at ≥90%; this inconsistency matters because it affects whether edits require an explicit user approval.
[filesystem_access_implicit] expected: Scripts read and write ~/.openclaw/workspace/skills and per-skill references/usage_log.md. No environment variables required, but implicit filesystem access is necessary and should be considered when granting privileges.
[truncated_or_buggy_script] unexpected: The included batch_analyze.py appears truncated in the provided listing (the final returned dict refers to 'efficienc' and the file is '... [truncated]'). This indicates the distributed code may be incomplete or contain copy/paste truncation bugs which could cause runtime exceptions or unpredictable behavior.
What to consider before installing
Before installing or enabling Skill Garden, consider the following:

  • This skill needs read/write access to your ~/.openclaw/workspace/skills tree and will read logs and edit other skills' SKILL.md files. That is necessary for its purpose but is a powerful privilege — back up your skills directory first.
  • SKILL.md promises 'user-in-control', but the decision rules auto-apply edits at high and mid confidence (≥90% and 70–89%). If you want manual approval for all edits, change the min-confidence / auto-apply thresholds or run batch_analyze.py in --dry-run mode only.
  • The bundled batch_analyze.py listing appears truncated/buggy; review the full scripts in a safe environment before letting them run. Look for file-write locations, how edits are applied, and any notification logic.
  • Prefer a conservative deployment: run the scripts manually (dry-run) first, inspect proposals in references/improvement_proposals.md, and only allow auto-apply after you validate behavior.
  • If you must test, limit the skill to a copy of your workspace or exclude high-value/production skills from its scope until you're comfortable.

If you want, I can: (1) point out the exact file lines that perform writes, (2) suggest a minimal safe configuration (disable auto-apply, change cron to manual), or (3) produce step-by-step instructions to sandbox and test the skill safely.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97dcxtmkbhzzjb7y99mgv77ad85a382
55 downloads
0 stars
1 version
Updated 5d ago
v1.0.0
MIT-0

🌿 Skill Garden — Skill Evolution Engine

"Every skill should get better the more you use it."

Philosophy

Skill Garden treats skill improvement as a continuous, invisible process — not a special operation. It runs passively in the background, accumulating observations from every skill invocation, then periodically synthesizes them into concrete improvements.

Key design principles:

  • Token-efficient: Lightweight structured logs, batch processing, no real-time overhead
  • User-in-control: High-confidence changes auto-apply; uncertain ones ask first
  • Transparent: Every change is explained; nothing happens silently
  • Self-contained: Manages its own memory, dashboard, proposals, and cron schedule

Three-Layer Architecture

Layer 1 — Passive Observation (near-zero token cost)
  Every skill invocation → 1-line structured log entry
  Abnormal outcomes (FAIL/SLOW) → detailed log with evidence

Layer 2 — Weekly Batch Analysis (isolated agent, ~5-15 min)
  Read all accumulated logs
  Run evaluation engine across 6 dimensions
  Generate specific improvement proposals

Layer 3 — Targeted Modification (low frequency, high precision)
  Confidence ≥ 90% → apply immediately, notify user
  Confidence 70–89% → apply with [experimental] tag
  Confidence 50–69% → write to proposals, ask user
  Confidence < 50% → log as observation only
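
These confidence bands are the entire approval policy, so it helps to see them as code. Below is a minimal sketch of the Layer 3 decision tree; route_proposal is a hypothetical helper, not part of the bundled scripts:

# Minimal sketch of the Layer 3 decision tree above. The function name and
# return values are illustrative; batch_analyze.py's actual API is not shown here.
def route_proposal(confidence: int) -> str:
    """Map a proposal's confidence (0-100) to the action Skill Garden takes."""
    if confidence >= 90:
        return "apply_and_notify"        # auto-apply, then notify the user
    if confidence >= 70:
        return "apply_as_experimental"   # auto-apply with an [experimental] tag
    if confidence >= 50:
        return "propose_and_ask"         # write to proposals, ask the user
    return "log_observation_only"        # record as evidence, change nothing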

When This Skill Activates

Trigger 1 — After every skill use (automatic, passive): When any skill finishes executing, immediately log the outcome using the format in references/usage_tracker.md. This is the most important layer — it costs almost nothing and feeds everything else.

Trigger 2 — On user request:

  • "grow this skill" / "improve skill" / "optimize skill"
  • "why did this fail" / "analyze this skill"
  • "run skill analysis" / "check my skills"
  • "skill health" / "skill dashboard"

Trigger 3 — On schedule (automatic, every Sunday 20:00): An isolated agent runs batch_analyze.py and generate_report.py, applies high-confidence improvements, and sends you a summary.

Layer 1: Passive Observation

After any skill finishes (any outcome: OK, FAIL, PARTIAL, SLOW, SKIP), immediately write a structured log. Use log_insight.py or write directly to the skill's references/usage_log.md.

Log Entry Format

For OK outcomes with nothing notable (minimal tokens):

## YYYY-MM-DD HH:MM
Trigger: [trigger in ≤10 words]
Outcome: OK
Signal: [one-line finding or "No issues"]

For PARTIAL, FAIL, SLOW outcomes (always log all fields):

## YYYY-MM-DD HH:MM

### Trigger
[What the user asked for, ≤10 words]

### Outcome
OK | PARTIAL | FAIL | SLOW | SKIP

### Signal
[One specific phrase: what this tells us about the skill]
Examples:
  - "Covered: standard use case works perfectly"
  - "Missing: error handling for network timeouts"
  - "Ambiguous: step 3 could be interpreted two ways"
  - "Outdated: API version in skill doesn't match current"

### Evidence
[1-2 sentences. Quote or paraphrase exact output/error. Be specific.]

### Flags
[Comma-separated tags: [new_trigger] [missing_coverage] [confusing_step]
 [outdated_info] [token_heavy] [edge_case] [user_workaround_used]
 [config_stale] [api_change] [Covered] [success_boost]]

Using log_insight.py

# Quick OK log (minimal)
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/log_insight.py \
  --skill github-trending-summary \
  --trigger "daily top 5 repos" \
  --outcome OK \
  --signal "Covered: standard case"

# Detailed failure log
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/log_insight.py \
  --skill banxuebang-helper \
  --trigger "check homework" \
  --outcome FAIL \
  --signal "Missing: semester selector not dynamic" \
  --evidence "Config hardcoded to 2024-2025 but API shows 2025-2026 is current." \
  --flags "missing_coverage,config_stale" \
  --mark-landmark "SkillImproved"
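
This page doesn't show log_insight.py's internals. Here is a hypothetical sketch of what it plausibly does, assuming it simply formats an entry per the schema above and appends it to the target skill's references/usage_log.md. The argument names mirror the CLI examples; everything else is an assumption (flags such as --mark-landmark are omitted):

# Hypothetical sketch of log_insight.py: format a usage-log entry per the
# schema above and append it to the target skill's usage_log.md.
# Paths and arguments mirror the CLI examples; internals are assumptions.
import argparse, datetime, pathlib

SKILLS_DIR = pathlib.Path.home() / ".openclaw" / "workspace" / "skills"

def main() -> None:
    p = argparse.ArgumentParser(description="Append a usage log entry (sketch).")
    p.add_argument("--skill", required=True)
    p.add_argument("--trigger", required=True)
    p.add_argument("--outcome", required=True,
                   choices=["OK", "PARTIAL", "FAIL", "SLOW", "SKIP"])
    p.add_argument("--signal", required=True)
    p.add_argument("--evidence", default="")
    p.add_argument("--flags", default="")
    args = p.parse_args()

    stamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M")
    if args.outcome == "OK" and not args.evidence:
        # Minimal entry for unremarkable successes (see format above)
        entry = (f"\n## {stamp}\nTrigger: {args.trigger}\n"
                 f"Outcome: OK\nSignal: {args.signal}\n")
    else:
        # Full structured entry for PARTIAL / FAIL / SLOW outcomes
        flags = " ".join(f"[{f.strip()}]" for f in args.flags.split(",") if f.strip())
        entry = (f"\n## {stamp}\n\n### Trigger\n{args.trigger}\n\n"
                 f"### Outcome\n{args.outcome}\n\n### Signal\n{args.signal}\n\n"
                 f"### Evidence\n{args.evidence}\n\n### Flags\n{flags}\n")

    log = SKILLS_DIR / args.skill / "references" / "usage_log.md"
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write(entry)

if __name__ == "__main__":
    main()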

Rule of thumb: If you had to pause, reconsider, or work around something — log it with full detail. If it just worked perfectly — log minimally. The goal is signal, not noise.

Layer 2: Weekly Batch Analysis

Run manually or wait for the Sunday cron trigger.

Manual Trigger

Say: "run skill analysis" or "grow all skills"

The analysis does the following in order:

  1. Scan all skills — read every references/usage_log.md
  2. Evaluate each skill across 6 dimensions (see references/evaluation_engine.md)
  3. Generate proposals — for each skill with score below threshold
  4. Apply high-confidence changes — auto-edit SKILL.md for confident improvements
  5. Update dashboard — rewrite references/dashboard.md
  6. Notify user — send summary message
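
A minimal sketch of steps 1–4 of that loop, with crude stand-in scoring (the real six-dimension evaluation lives in references/evaluation_engine.md). The directory paths match the ones used elsewhere on this page; everything else is assumed:

# Sketch of the batch-analysis loop: scan logs, score, propose, optionally apply.
# The scoring and proposal logic here are illustrative stand-ins only.
import pathlib

SKILLS_DIR = pathlib.Path.home() / ".openclaw" / "workspace" / "skills"

def analyze_all(dry_run: bool = True, min_confidence: int = 70) -> list[dict]:
    proposals = []
    for skill_dir in sorted(d for d in SKILLS_DIR.iterdir() if d.is_dir()):
        log = skill_dir / "references" / "usage_log.md"
        if not log.exists():
            continue                              # step 1: scan skills with logs
        text = log.read_text(encoding="utf-8")
        fails = text.count("FAIL")                # step 2: crude stand-in scoring
        if fails == 0:
            continue
        proposals.append({                        # step 3: generate a proposal
            "skill": skill_dir.name,
            "change": f"address {fails} FAIL event(s) found in usage log",
            "confidence": min(50 + 10 * fails, 95),
        })
    if not dry_run:
        for prop in proposals:                    # step 4: apply high-confidence edits
            if prop["confidence"] >= min_confidence:
                pass  # edit the target SKILL.md here (see Layer 3 below)
    return proposals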

Running Scripts Directly

# Full batch analysis (evaluate all skills, generate proposals)
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/batch_analyze.py

# Analyze one skill only
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/batch_analyze.py --skill github-trending-summary

# Dry run (proposals only, don't apply)
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/batch_analyze.py --dry-run --min-confidence 70

# Generate/refresh dashboard
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/generate_report.py

# Output as JSON (for integrations)
python3 ~/.openclaw/workspace/skills/skill-garden/scripts/generate_report.py --output json

Layer 3: Applying Improvements

The Six Evaluation Dimensions

| Dimension | Weight | What It Measures |
| --- | --- | --- |
| Coverage | 30% | Does the skill's description match how it's actually used? |
| Completeness | 25% | Are all necessary steps present? Do FAIL events reveal missing coverage? |
| Clarity | 20% | Are steps unambiguous? Are there [confusing_step] or [user_workaround_used] flags? |
| Currency | 15% | Is the information still accurate? Are there [outdated_info] or [config_stale] flags? |
| Efficiency | 10% | Is it unnecessarily verbose or token-heavy? |
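
The weights above combine into a single per-skill score. A minimal sketch, assuming a plain weighted average over the dimensions listed (the actual algorithm is in references/evaluation_engine.md; the function name and example values are illustrative):

# Sketch of a weighted overall score using the weights listed above.
# Per-dimension scores (0-100) would come from the evaluation engine.
WEIGHTS = {
    "coverage": 0.30,
    "completeness": 0.25,
    "clarity": 0.20,
    "currency": 0.15,
    "efficiency": 0.10,
}

def overall_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of per-dimension scores on a 0-100 scale."""
    return sum(WEIGHTS[d] * dimension_scores.get(d, 0.0) for d in WEIGHTS)

# Example:
# overall_score({"coverage": 66, "completeness": 80, "clarity": 90,
#                "currency": 70, "efficiency": 85})  # -> 76.8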

See references/evaluation_engine.md for the full evaluation algorithm, scoring thresholds, and confidence calibration guide.

Applying an Edit to SKILL.md

When a proposal meets the confidence threshold:

  1. Read the current SKILL.md
  2. Identify the exact text to replace using the edit tool
  3. Write the improved version
  4. Add a brief changelog note at the top of the edit:
    <!-- Auto-improved by Skill Garden: YYYY-MM-DD
         Reason: [confidence]% confidence — [evidence summary] -->
    
  5. Update references/improvement_proposals.md to mark as applied
  6. Notify the user with a summary of what changed
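
Steps 1–4 amount to a guarded text replacement plus a changelog note. A minimal sketch, assuming a direct file rewrite; apply_edit is a hypothetical helper (the real flow uses the agent's edit tool rather than raw file writes):

# Sketch of steps 1-4: replace one exact span in SKILL.md and prepend the
# changelog comment in the documented format. Helper name is illustrative.
import datetime, pathlib

def apply_edit(skill_md: pathlib.Path, old: str, new: str,
               confidence: int, evidence: str) -> None:
    """Replace one exact span in SKILL.md, prefixing the changelog note."""
    text = skill_md.read_text(encoding="utf-8")
    if old not in text:
        raise ValueError("exact target text not found; refusing to guess")
    note = (f"<!-- Auto-improved by Skill Garden: {datetime.date.today().isoformat()}\n"
            f"     Reason: {confidence}% confidence — {evidence} -->\n")
    skill_md.write_text(text.replace(old, note + new, 1), encoding="utf-8")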

Editing Checklist

Before applying any edit:

  • Change is specific and testable (not vague advice)
  • New text is more concrete than old text (examples > statements)
  • If adding a step, verify it doesn't contradict existing steps
  • If removing text, verify no other part of the skill depends on it
  • If changing description, verify all log triggers are now covered
  • Change addresses the flagged evidence, not just the symptom

Dashboard

The dashboard (references/dashboard.md) shows:

  • Overall skill ecosystem health
  • Per-skill scores across all 6 dimensions
  • Recent signals and flags
  • Pending proposals
  • Weekly outcome distribution
  • Recent landmark events

Regenerate with:

python3 ~/.openclaw/workspace/skills/skill-garden/scripts/generate_report.py

Cron Setup

Set up a weekly Sunday 20:00 analysis run:

openclaw cron add \
  --name "Skill Garden Weekly Analysis" \
  --schedule '{"kind":"cron","expr":"0 20 * * 0","tz":"Asia/Shanghai"}' \
  --sessionTarget isolated \
  --payload '{"kind":"agentTurn","message":"Run full skill analysis: execute batch_analyze.py then generate_report.py and notify me of any improvements applied.","timeoutSeconds":900}'

Or manually trigger with:

grow all skills

Reference Files

| File | Purpose |
| --- | --- |
| references/usage_tracker.md | Full logging schema, log rotation, landmark events |
| references/evaluation_engine.md | Scoring algorithm, thresholds, confidence calibration |
| references/improvement_examples.md | Real improvement examples with full context |
| references/dashboard.md | Auto-generated growth dashboard (don't edit manually) |
| references/master_log.md | Grower's own lightweight master log |
| scripts/log_insight.py | CLI for writing log entries |
| scripts/batch_analyze.py | Core analysis engine |
| scripts/generate_report.py | Dashboard generator |

Examples in Context

Example 1: Passive Log — Skill Works Perfectly

User: "给我推荐一首今天听的歌" Skill: daily-song-recommender — succeeds, email sent Your action: Write one-line abbreviated log:

## 2026-04-22 09:15
Trigger: daily song recommendation
Outcome: OK
Signal: Covered: standard use case, no issues

Example 2: Passive Log — Skill Fails Due to Missing Coverage

User: "查一下这周的数学作业" Skill: banxuebang-helper — API error: semester field stale Your action: Write full structured log:

## 2026-04-22 21:30

### Trigger
banxuebang: check math homework

### Outcome
FAIL

### Signal
Missing: semester config hardcoded, not dynamic

### Evidence
API returned {"code": 500, "msg": "semester not found"}.
Config in skill has "2024-2025下学期" but actual is "2025-2026下学期".

### Flags
[missing_coverage] [config_stale]

Example 3: Passive Log — New Trigger Not in Description

User: "爬一下GitHub今日热榜发到我邮箱" Skill: github-trending-summary — works, but user's phrasing not in description Your action: Write log with [new_trigger] flag:

## 2026-04-22 10:00

### Trigger
github-trending-summary: crawl and email

### Outcome
OK

### Signal
Covered: but user said "爬" not in description triggers

### Evidence
Skill handled it fine, but description doesn't mention "爬" as a trigger phrase.

### Flags
[new_trigger]

Example 4: User Requests Analysis

User: "run skill analysis" Your action:

  1. Run batch_analyze.py --dry-run
  2. Read the proposals from output
  3. Apply high-confidence changes (≥90%) via edit tool
  4. Run generate_report.py to refresh dashboard
  5. Message user: "Found N improvement(s) — applied X automatically, Y need your review"

Example 5: Weekly Cron Fires (Sunday 20:00)

Isolated agent runs full cycle:

  1. batch_analyze.py scans all 20 installed skills
  2. Finds github-trending-summary: 1 [new_trigger] flag, coverage 66%
  3. Generates proposal with 65% confidence → written to proposals
  4. Updates dashboard
  5. You receive: "🌿 Weekly analysis done. github-trending-summary needs description update (65% confidence — needs more data to auto-apply). Review?"
