Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Test-Driven Revolution

v2.0.1

Test-Driven Revolution implements an AI-driven iterative code evolution system with automated coding, testing, auditing, and controlled task workflows.

by Jaden's built a claw (@cjboy007)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for cjboy007/test-driven-revolution.

Install the skill "Test-Driven Revolution" (cjboy007/test-driven-revolution) from ClawHub.
Skill page: https://clawhub.ai/cjboy007/test-driven-revolution
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install test-driven-revolution

ClawHub CLI


npx clawhub@latest install test-driven-revolution
Security Scan
VirusTotal: Pending
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The name/description (AI-driven iterative code evolution) aligns with the provided scripts (planning, review, execute, audit, heartbeats, locking). The files implement the planner/reviewer/executor/auditor workflow the skill advertises.
Instruction Scope
SKILL.md instructs running the included scripts and scheduling heartbeats; the runtime scripts (iron-heartbeat.js, heartbeat-coordinator.js, auto-plan.js, apply-review.js, etc.) read/write task files and may execute commands produced by model reviews. In particular, executeInSandbox ultimately runs execSync(instructions) where instructions come from review.next_instructions — i.e., model-controlled shell commands are executed in the workspace. Although the docs mention sandboxing and security scans, the code does not enforce a sandbox (comments say Docker/nsjail recommended but not used) and relies on security-scan and user/manual steps that may be bypassed. This expands scope beyond safe, narrowly-scoped actions.
Install Mechanism
There is no install spec (instruction-only from registry perspective) and code files are provided. Nothing in the package pulls arbitrary external binaries; risk comes from runtime execution rather than install-time downloads.
Credentials
The skill declares no required environment variables or credentials, which is consistent with the files (they assume model access provided by the OpenClaw environment). There are references to model names and sessions_spawn, but no unexpected credential requests embedded in the manifest. That said, the skill will execute arbitrary commands and could access any files in the workspace without asking for additional env secrets.
Persistence & Privilege
always is false and the skill does not request to modify other skills. However, SKILL.md and README recommend scheduling cron heartbeats and creating long-running agent heartbeats (Wilson/Iron/Auditor), which gives the skill a persistent operational presence if the user follows those instructions. Persistent execution combined with arbitrary command execution increases blast radius if misused.
What to consider before installing
This skill implements an automated pipeline that accepts model-generated instructions and runs them as shell commands in your workspace. Before installing or scheduling its heartbeats:

  • Do NOT run these scripts on a machine with sensitive data or credentials. Use an isolated VM or container.
  • Inspect security-scan.js and confirm it reliably blocks dangerous patterns you care about (it is referenced but enforcement appears partial).
  • Replace or enforce a real sandbox (Docker, nsjail, or other containerization) for executeInSandbox instead of the current execSync-based execution. The code has comments recommending sandboxing but does not enforce it.
  • Avoid automatically scheduling cron jobs until you trust the code; run tasks manually first and watch logs/events.log.
  • Review how review.next_instructions is produced and validated: model outputs can include network exfiltration, credential-stealing commands, or destructive operations. Ensure review outputs are human-reviewed or strictly validated before execution.
  • If you want to proceed, run the skill in a tightly restricted environment, audit logs regularly, and do not grant the host any secrets or cloud credentials accessible from the skill's workspace.

Given the mismatch between claimed safety controls (sandboxing, scans) and the current implementation (direct execSync), treat this skill as high-risk until you harden execution and verification paths.
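One way to narrow the execSync(instructions) path described above is to refuse shell strings and run only allow-listed binaries with argument arrays. The sketch below is an illustration under assumptions, not the skill's code: ALLOWED_BINARIES, parseInstruction, and runInstruction are hypothetical names, and the naive whitespace split deliberately rejects quoting and shell metacharacters rather than trying to interpret them.

```javascript
const { execFileSync } = require("node:child_process");
const path = require("node:path");

// Hypothetical allow-list; a real deployment would choose its own.
const ALLOWED_BINARIES = new Set(["node", "npm", "git"]);

// Characters that only make sense to a shell. Rejecting them outright is
// stricter than parsing them, which is the point of this sketch.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\'"\n]/;

function parseInstruction(instruction) {
  if (typeof instruction !== "string" || instruction.trim() === "") {
    throw new Error("instruction must be a non-empty string");
  }
  if (SHELL_METACHARACTERS.test(instruction)) {
    throw new Error("shell metacharacters are not allowed");
  }
  const [binary, ...args] = instruction.trim().split(/\s+/);
  if (!ALLOWED_BINARIES.has(binary)) {
    throw new Error(`binary not on the allow-list: ${binary}`);
  }
  return { binary, args };
}

function runInstruction(instruction, workspaceDir) {
  const { binary, args } = parseInstruction(instruction);
  // execFileSync does not spawn a shell by default, so the arguments are
  // passed to the binary verbatim instead of being interpreted.
  return execFileSync(binary, args, {
    cwd: path.resolve(workspaceDir),
    timeout: 60_000,
    encoding: "utf8",
  });
}
```

This only narrows what can be launched. Allow-listing an interpreter such as node or npm still permits arbitrary code, so it is not a substitute for the container or nsjail isolation recommended above.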
scripts/heartbeat-coordinator.js:50
Shell command execution detected (child_process).
scripts/iron-heartbeat.js:126
Shell command execution detected (child_process).
Patterns worth reviewing
These patterns may indicate risky behavior. Check the VirusTotal and OpenClaw results above for context-aware analysis before installing.

Like a lobster shell, security has layers — review code before you run it.

latest: vk975j4mdf9853sba4jb218yey183t4kx
115 downloads
0 stars
5 versions
Updated 1mo ago
v2.0.1
MIT-0

Test-Driven Revolution Skill

Version: 2.0.0
AgentSkills: v1.1.0
Author: OpenClaw Community
Created: 2026-03-28
Updated: 2026-03-29


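Description

Test-Driven Revolution (TDR) is a test-driven system for automated, AI-driven code evolution.

AI writes code → AI tests → AI fixes bugs → tests pass → next round

Relationship to Auto Revolution:

  • TDR = the user interface (an AgentSkills skill package, triggered manually)
  • Auto Revolution = the underlying execution engine (run automatically by a cron heartbeat)
  • Both share the same configuration and scripts; only the trigger differs

Trigger comparison:

| System | Trigger | Use case |
| --- | --- | --- |
| TDR | User says "Use TDR to create XX" | Actively developing a new feature |
| Auto Revolution | Cron job (every 5 minutes) | Running the task queue automatically in the background |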

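Core workflow:

  1. Task analysis - analyze the task's difficulty, risk, and time requirements, then recommend a flow
  2. User choice - the user confirms the flow type (simplified/full/advanced)
  3. Flow execution - run the chosen flow (review → execute → audit)
  4. Iteration - audit passes → next round; audit fails → sent back for rework

Roles:

  • Planner - task analysis and flow recommendation (main agent)
  • Reviewer - technology-choice review and instruction generation (advanced model)
  • Executor - code execution and file operations (default model) ⭐
  • Auditor - quality audit and verification (advanced model)

System features:

  • 📊 Task difficulty analysis - automatically assesses complexity, risk, and time requirements
  • 🎯 Flow recommendation - recommends the simplified, full, or advanced flow based on task type
  • 👤 User choice - the user confirms the final flow after the recommendation
  • 🔒 Atomic lock: an atomic mkdir operation prevents concurrent race conditions
  • 🛡️ Security scan: detects dangerous command patterns before execution
  • 📝 Event log: every operation is appended to a JSONL log
  • 🔗 Dependency activation: downstream tasks are activated automatically once their dependencies complete
  • 💰 Cost control: the Executor always uses the default model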
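
The mkdir-based atomic lock listed above can be sketched as follows. This is a minimal illustration under assumptions: the acquireLock, releaseLock, and withLock helpers and the lock directory name are hypothetical and are not taken from the skill's scripts. It relies on fs.mkdirSync failing with EEXIST when the directory already exists, so only one caller can create it.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Try to take the lock by creating a directory. mkdir either creates the
// directory or fails with EEXIST, so two callers cannot both succeed.
function acquireLock(lockDir) {
  try {
    fs.mkdirSync(lockDir); // non-recursive on purpose: recursive mode ignores EEXIST
    return true;
  } catch (err) {
    if (err.code === "EEXIST") return false; // someone else holds the lock
    throw err;
  }
}

function releaseLock(lockDir) {
  fs.rmdirSync(lockDir);
}

// Run fn only if the lock was obtained; always release afterwards.
function withLock(lockDir, fn) {
  if (!acquireLock(lockDir)) return { ran: false };
  try {
    return { ran: true, value: fn() };
  } finally {
    releaseLock(lockDir);
  }
}

// Usage with a throwaway directory and a hypothetical task id.
// The nested call is refused while the outer call holds the lock.
const base = fs.mkdtempSync(path.join(os.tmpdir(), "tdr-lock-"));
const lockDir = path.join(base, "task-001.lock");
const outer = withLock(lockDir, () => withLock(lockDir, () => "inner"));
console.log(outer);
```

The sketch does not handle stale locks left behind by a crashed process; a real implementation would need some expiry or ownership check.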

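Trigger keywords

Trigger explicitly (recommended):

  • "Use TDR to create an HTTP server"
  • "Start TDR and build a quotation generator"
  • "Run the evolution system to develop feature XX"
  • "Create a TDR task"

Do not trigger (execute directly):

  • "Create a channel" → call the message tool directly
  • "Email Zhang San" → call the email tool directly
  • "Check the weather" → call the weather tool directly
  • "Create a file" → write the file directly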

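Use cases ⭐

✅ Good fit for TDR

| Scenario | Example | Recommended flow |
| --- | --- | --- |
| Complex feature development | "Use TDR to build an automatic email reply system" | Full flow |
| Code generation + tests | "Use TDR to create an API client with unit tests" | Full flow |
| Multi-step tasks | "Use TDR to develop a complete CRUD interface" | Full flow |
| Quality assurance required | "Use TDR to develop a payment module" | Advanced flow |
| Documentation updates | "Use TDR to fill in the references docs" | Simplified flow |
| Bug fix (urgent) | "Use TDR to fix the login vulnerability" | Simplified flow |

❌ Not a good fit for TDR

| Scenario | Correct approach | Reason |
| --- | --- | --- |
| Simple operations | Call the tool directly | No iterative testing needed |
| One-off tasks | Execute directly | The TDR flow is too heavy |
| External API calls | Call the API directly | TDR does not call external APIs directly |
| Everyday conversation | Reply normally | No code generation needed |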

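Three flow modes

1. Simplified flow ⚡

Steps: Executor only (no review or audit)

Model: Executor = default model

Duration: ~5 minutes/task

Cost: 🆓 Free (uses the monthly subscription quota)

Use cases:

  • Documentation updates (SKILL.md, reference docs)
  • Simple bug fixes (urgent)
  • Batch improvements (low risk)
  • Fewer than 100 lines of code

Example:

node scripts/auto-plan.js "Update the references docs"
# Auto-selected: simplified flow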

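2. Full flow 📊

Steps: Reviewer → Executor → Auditor

Models:

  • Reviewer: advanced model → fallback → backup model
  • Executor: default model
  • Auditor: advanced model → fallback → backup model

Duration: ~15 minutes/task

Cost: 🆓 Free (uses the monthly subscription quota)

Use cases:

  • New feature implementation
  • Code refactoring (medium risk)
  • Multi-file changes
  • 100-500 lines of code

Example:

node scripts/auto-plan.js "Implement the forward command"
# Auto-selected: full flow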

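3. Advanced flow 🏆

Steps: Reviewer → Executor → Auditor

Models:

  • Reviewer: top-tier model
  • Executor: default model
  • Auditor: top-tier model

Duration: ~15 minutes/task

Cost: 💰$$ (pay-as-you-go, about $0.5-2/task)

Use cases:

  • Core feature development
  • Security-related fixes
  • Pre-release verification for production
  • More than 500 lines of code

Example:

node scripts/auto-plan.js "Fix the prompt injection vulnerability"
# Auto-selected: advanced flow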

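Task difficulty assessment

Assessment dimensions

| Dimension | Simple | Medium | Complex |
| --- | --- | --- | --- |
| Code size | <100 lines | 100-500 lines | >500 lines |
| File count | 1-2 | 3-5 | >5 |
| Dependencies | | Partial | Cross-module |
| Risk | | | |
| Rollback possible | | Partial | |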
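
The code-size and file-count thresholds above can be turned into a small classifier. This is a rough sketch, not the skill's analyzer: classifyComplexity and its helpers are hypothetical names, only the two dimensions with complete thresholds are used, and taking the highest level across dimensions is an assumption about how the dimensions combine.

```javascript
const LEVELS = ["simple", "medium", "complex"];

// Thresholds from the table: <100 / 100-500 / >500 lines of code.
function levelForLines(lines) {
  if (lines < 100) return 0;
  if (lines <= 500) return 1;
  return 2;
}

// Thresholds from the table: 1-2 / 3-5 / >5 files.
function levelForFiles(files) {
  if (files <= 2) return 0;
  if (files <= 5) return 1;
  return 2;
}

// Assumption: the overall rating is the highest level any dimension reaches.
function classifyComplexity({ lines, files }) {
  return LEVELS[Math.max(levelForLines(lines), levelForFiles(files))];
}

console.log(classifyComplexity({ lines: 300, files: 3 })); // medium
```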

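Automatic recommendation rules

| Task type | Default flow | Exception |
| --- | --- | --- |
| Documentation update | Simplified flow | Large-scale refactor → full |
| Bug fix | Simplified flow | Security vulnerability → advanced |
| New feature | Full flow | Core feature → advanced |
| Code refactoring | Full flow | Core architecture → advanced |
| Security fix | Advanced flow | - |
| Batch improvement | Simplified flow | Affects core → full |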
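
The recommendation rules above amount to a lookup with one escalation per task type. The following sketch shows that shape; the task-type keys, the boolean flag names, and recommendFlow itself are hypothetical identifiers chosen for illustration, and how auto-plan.js actually detects an exception is not shown in this document.

```javascript
// Default flow per task type, mirroring the table above.
const DEFAULT_FLOW = {
  docs: "simplified",
  bugfix: "simplified",
  feature: "full",
  refactor: "full",
  security: "advanced",
  batch: "simplified",
};

// Exceptions from the same table: a flag on the task escalates the flow.
const EXCEPTIONS = {
  docs: { flag: "largeRefactor", flow: "full" },
  bugfix: { flag: "securityVulnerability", flow: "advanced" },
  feature: { flag: "coreFeature", flow: "advanced" },
  refactor: { flag: "coreArchitecture", flow: "advanced" },
  batch: { flag: "affectsCore", flow: "full" },
};

function recommendFlow(task) {
  const base = DEFAULT_FLOW[task.type];
  if (!base) throw new Error(`unknown task type: ${task.type}`);
  const exception = EXCEPTIONS[task.type];
  if (exception && task[exception.flag] === true) return exception.flow;
  return base;
}

console.log(recommendFlow({ type: "bugfix" })); // simplified
console.log(recommendFlow({ type: "bugfix", securityVulnerability: true })); // advanced
```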

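Usage

0. Automatic analysis + user choice (recommended) ⭐

Fully automatic mode:

# Step 1: analyze the task and recommend a flow
node scripts/auto-plan.js "Create an HTTP server listening on port 3000"

# Output:
# 📋 Task analysis
# - Complexity: medium (about 300 lines of code, 3 files)
# - Risk: medium (new feature, does not affect existing modules)
# - Time: normal
# 
# 🎯 Recommended flow: full flow
# Reasons:
# 1. New feature development that needs quality assurance
# 2. Multi-file changes (route/controller/test)
# 3. Estimated duration ~15 minutes
# 
# ❓ Choose a flow:
# A. Simplified flow (5 minutes, no audit)
# B. Full flow (15 minutes, with audit) ⭐ recommended
# C. Advanced flow (15 minutes, audited by a top-tier model, paid)

Run after the user confirms:

# The user replies "B" or "full flow"
node scripts/auto-plan.js --confirm B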

1. Create a task manually

Create a JSON file in the tasks/ directory:

cat > tasks/task-001.json << 'EOF'
{
  "task_id": "task-001",
  "title": "Manually created task",
  "description": "Task description",
  "status": "pending",
  "priority": "P1",
  "flow": "full",
  "subtasks": [
    {
      "id": 0,
      "title": "Subtask 1",
      "description": "Description"
    }
  ]
}
EOF

Allowed values for the flow field:

  • simplified - simplified flow
  • full - full flow
  • advanced - advanced flow
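
A hand-written task file is easy to get wrong, so a quick check before running it may help. The sketch below is illustrative only: validateTask is a hypothetical helper, and which fields are treated as required is an assumption based on the example above, not a schema published by the skill. Only the three flow values come from the list above.

```javascript
const VALID_FLOWS = new Set(["simplified", "full", "advanced"]);

// Returns a list of problems; an empty list means the task looks usable.
function validateTask(task) {
  const problems = [];
  // Assumption: these string fields are required.
  for (const field of ["task_id", "title", "status"]) {
    if (typeof task[field] !== "string" || task[field] === "") {
      problems.push(`missing or empty field: ${field}`);
    }
  }
  if (!VALID_FLOWS.has(task.flow)) {
    problems.push(`flow must be one of: ${[...VALID_FLOWS].join(", ")}`);
  }
  if (!Array.isArray(task.subtasks) || task.subtasks.length === 0) {
    problems.push("subtasks must be a non-empty array");
  }
  return problems;
}

const task = JSON.parse(`{
  "task_id": "task-001",
  "title": "Manually created task",
  "status": "pending",
  "flow": "full",
  "subtasks": [{ "id": 0, "title": "Subtask 1" }]
}`);
console.log(validateTask(task)); // []
console.log(validateTask({ ...task, flow: "fast" })); // one problem about the flow value
```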

2. Run a task

# Run automatically (uses the flow field from the task configuration)
node scripts/auto-execute.js task-001

# Specify the flow manually
node scripts/auto-execute.js task-001 --flow full

Configuration file

Location: config/models.json

{
  "roles": {
    "reviewer": {
      "primary": "advanced model",
      "fallback": "backup model"
    },
    "executor": "default model",
    "auditor": {
      "primary": "advanced model",
      "fallback": "backup model"
    }
  },
  "timeouts": {
    "reviewer": 300,
    "executor": 300,
    "auditor": 180
  },
  "enforceAudit": true,
  "executorDefault": "default model"
}
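
To show how a configuration of this shape might be consumed, here is a small sketch that resolves the models to try for a role and the role's timeout. modelCandidates and timeoutMs are hypothetical helpers, the model names are placeholders, and treating the timeout values as seconds is an assumption; the document does not state the unit.

```javascript
// Inline copy of the structure shown above, with placeholder model names.
const config = {
  roles: {
    reviewer: { primary: "advanced model", fallback: "backup model" },
    executor: "default model",
    auditor: { primary: "advanced model", fallback: "backup model" },
  },
  timeouts: { reviewer: 300, executor: 300, auditor: 180 },
};

// Returns the ordered list of models to try for a role.
// A role is either a plain string or a { primary, fallback } object.
function modelCandidates(cfg, role) {
  const entry = cfg.roles[role];
  if (entry === undefined) throw new Error(`unknown role: ${role}`);
  if (typeof entry === "string") return [entry];
  return [entry.primary, entry.fallback].filter(Boolean);
}

// Assumption: timeouts are expressed in seconds.
function timeoutMs(cfg, role) {
  return cfg.timeouts[role] * 1000;
}

console.log(modelCandidates(config, "reviewer"), timeoutMs(config, "reviewer"));
```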

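Flow comparison table

| Feature | Simplified flow | Full flow | Advanced flow |
| --- | --- | --- | --- |
| Steps | Executor | Reviewer→Executor→Auditor | Reviewer→Executor→Auditor |
| Executor | Default model | Default model | Default model |
| Reviewer | - | Advanced model | Top-tier model |
| Auditor | - | Advanced model | Top-tier model |
| Duration | ~5m | ~15m | ~15m |
| Cost | 🆓 | 🆓 | 💰$$ |
| Success rate | ~80% | ~90% | ~95% |
| Best for | Docs / simple bugs | New features / refactoring | Core / security |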

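Security rules

Pre-execution checks

  1. Dangerous command detection - blocks rm -rf, DROP TABLE
  2. Write path validation - writes are restricted to the workspace directory
  3. External API calls - require user confirmation
  4. Large charges - amounts over $100 require manual approval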
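
The first two checks above (dangerous command detection and write path validation) could look roughly like the sketch below. It is not the contents of security-scan.js: findDangerousPatterns and isInsideWorkspace are hypothetical names, and the pattern list contains only the two commands named in the rules. Pattern matching of this kind is easy to evade (for example by splitting flags), and the path check does not resolve symlinks.

```javascript
const path = require("node:path");

// Illustrative patterns only: the two commands named in the rules above.
const DANGEROUS_PATTERNS = [
  // rm with both r and f in the same flag group, in any order (-rf, -fr, -Rf)
  { name: "rm -rf", regex: /\brm\s+-(?=[a-z]*r)(?=[a-z]*f)[a-z]+/i },
  { name: "DROP TABLE", regex: /\bdrop\s+table\b/i },
];

// Returns the names of every pattern found in the command string.
function findDangerousPatterns(command) {
  return DANGEROUS_PATTERNS.filter((p) => p.regex.test(command)).map((p) => p.name);
}

// True only when target resolves to workspaceDir itself or something below it.
function isInsideWorkspace(workspaceDir, target) {
  const root = path.resolve(workspaceDir);
  const resolved = path.resolve(root, target);
  const relative = path.relative(root, resolved);
  if (relative === "") return true;
  return relative.split(path.sep)[0] !== ".." && !path.isAbsolute(relative);
}

console.log(findDangerousPatterns("rm -rf build && npm test")); // [ 'rm -rf' ]
console.log(isInsideWorkspace("/work", "../etc/passwd")); // false
```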

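Monitoring during execution

  1. Atomic lock - prevents concurrent conflicts
  2. Event log - every operation is recorded as JSONL
  3. Timeout protection - execution is terminated automatically on timeout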
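
The JSONL event log mentioned above is an append-only file with one JSON object per line. A minimal sketch follows, assuming hypothetical event fields (ts, type, task_id) and hypothetical helpers logEvent and readEvents; the skill's real event schema is not shown in this document.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Append one event as a single JSON line. appendFileSync creates the file if needed.
function logEvent(logFile, type, details = {}) {
  const event = { ...details, ts: new Date().toISOString(), type };
  fs.appendFileSync(logFile, JSON.stringify(event) + "\n");
  return event;
}

// Read the log back, skipping blank lines.
function readEvents(logFile) {
  return fs
    .readFileSync(logFile, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Usage with a throwaway log file.
const logFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "tdr-log-")), "events.log");
logEvent(logFile, "lock_acquired", { task_id: "task-001" });
logEvent(logFile, "task_completed", { task_id: "task-001" });
console.log(readEvents(logFile).map((e) => e.type)); // [ 'lock_acquired', 'task_completed' ]
```

Because each event is a complete line, the file can be inspected with ordinary line-oriented tools.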

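Post-execution audit

  1. Auditor verification - mandatory for the full and advanced flows
  2. Test coverage - key functionality must pass its tests
  3. Rollback plan - must be provided for high-risk operations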

Error handling

Audit failure

Audit fails → current_iteration++ → sent back to the Executor for rework
At most 3 iterations → marked failed → manual intervention
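
The audit-failure rule above can be sketched as a bounded loop. This is an illustration, not the skill's implementation: runWithAudit is a hypothetical function, execute and audit are caller-supplied stand-ins for the Executor and Auditor roles, and the "completed" status and the counting from 1 are assumptions. The current_iteration field, the limit of 3, and the failed status come from the rule.

```javascript
const MAX_ITERATIONS = 3;

// execute(task, feedback) returns a result; audit(task, result) returns
// { passed, feedback }. Both are supplied by the caller.
function runWithAudit(task, execute, audit) {
  let feedback = null;
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    task.current_iteration = i;
    const result = execute(task, feedback);
    const verdict = audit(task, result);
    if (verdict.passed) {
      task.status = "completed";
      return task;
    }
    feedback = verdict.feedback; // sent back to the Executor for the redo
  }
  task.status = "failed"; // needs manual intervention
  return task;
}

// Stub roles: the audit passes on the second attempt.
let attempts = 0;
const done = runWithAudit(
  { task_id: "task-001", status: "pending" },
  () => ++attempts,
  (_task, result) => ({ passed: result >= 2, feedback: "tests failing" })
);
console.log(done.status, done.current_iteration); // completed 2
```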

Timeout handling

Timeout → retry (at most 2 times) → switch to the fallback model → if it still fails, mark failed
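
The timeout rule above can be modelled as retries per model followed by a switch to the next model. The sketch below is synchronous for brevity and rests on assumptions: TimeoutError, callWithFallback, and the callModel callback are hypothetical, and giving the fallback model the same retry budget as the primary is one reading of the rule, not something the document states.

```javascript
class TimeoutError extends Error {}

const MAX_RETRIES = 2; // retries after the first attempt, per model

// callModel(model) is a caller-supplied function that returns a result or
// throws TimeoutError. models is ordered: primary first, then fallback.
function callWithFallback(models, callModel) {
  const log = [];
  for (const model of models) {
    for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
      try {
        return { status: "ok", model, result: callModel(model), log };
      } catch (err) {
        if (!(err instanceof TimeoutError)) throw err; // only timeouts are retried
        log.push(`${model}: timeout on attempt ${attempt + 1}`);
      }
    }
  }
  return { status: "failed", log };
}

// Stub: the primary always times out, the fallback answers.
const outcome = callWithFallback(["primary", "fallback"], (model) => {
  if (model === "primary") throw new TimeoutError("timed out");
  return "review text";
});
console.log(outcome.status, outcome.model, outcome.log.length); // ok fallback 3
```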

Model authentication failure

401 authentication failure → switch to the fallback model → record an error log entry

Best practices

1. Task breakdown

Good:

{
  "subtasks": [
    {"title": "Create the HTTP server", "description": "Listen on port 3000"},
    {"title": "Add routes", "description": "GET /health, GET /api/data"},
    {"title": "Write tests", "description": "Unit test coverage >80%"}
  ]
}

Bad:

{
  "subtasks": [
    {"title": "Finish the whole project"}  // too vague
  ]
}

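2. Flow selection

  • Documentation update → simplified flow (fast, free)
  • New feature → full flow (audited, free)
  • Security fix → advanced flow (audited by a top-tier model, paid)

3. Iteration control

  • Keep a single iteration under 10 minutes
  • At most 3 iterations
  • Manual intervention after 3 failures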

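Version history

| Version | Date | Changes |
| --- | --- | --- |
| 2.0.0 | 2026-03-29 | Added task difficulty analysis, three flow modes, and the user choice mechanism |
| 1.0.0 | 2026-03-28 | Initial release |

Related documentation


Last updated: 2026-03-29
Maintainer: OpenClaw Community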
