Jury Review

v1.0.0

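A dynamic jury workflow for multi-dimensional scoring and cyclic iteration. It automatically generates jury members suited to the task and lets extreme reviewers join, so that reviewers surround the code on every side. Trigger words: 评审团 (jury), 多维评分 (multi-dimensional scoring), 迭代优化 (iterative optimization), 代码评审 (code review), 质量评估 (quality assessment), 周期迭代 (cyclic iteration), 动态评审 (dynamic review).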

by @kukuxnd

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for kukuxnd/jury-review.

Install the skill "Jury Review" (kukuxnd/jury-review) from ClawHub.
Skill page: https://clawhub.ai/kukuxnd/jury-review
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install jury-review

ClawHub CLI

npx clawhub@latest install jury-review

Security Scan

VirusTotal

Benign

OpenClaw

Benign

high confidence

✓

Purpose & Capability

The SKILL.md describes a juried review workflow and the repo includes a scoring guide and a local Python scorer (scripts/jury-scorer.py) that implements detection rules and produces JSON results — these are coherent with the stated goal.

ℹ

Instruction Scope

The SKILL.md instructs the agent to analyze tasks, generate reviewer roles, and iteratively generate/improve code based on feedback. Those instructions stay within the declared purpose (code review and iteration). Note: the workflow implies generating and assessing code; the skill does not instruct reading unrelated system secrets, but you should avoid passing sensitive files to the scorer (it reads arbitrary code files you point it at).

✓

Install Mechanism

No install spec or external downloads; the skill is instruction-only with a small included Python script. No unusual installers, archive downloads, or third‑party packages are declared.

✓

Credentials

The skill requests no environment variables, credentials, or config paths. The included script operates on a code file passed as an argument and does not access environment secrets.

✓

Persistence & Privilege

always:false and no special persistence or system configuration changes are requested. The skill can be invoked by the agent (default), which is expected for user-invocable review tools.

Assessment

This skill appears internally consistent and contains a local Python reviewer that flags common issues (e.g., strcpy, system(), scanf patterns, nested loops). Before installing or using it: (1) review scripts/jury-scorer.py yourself to confirm you accept its checks and outputs; (2) don't point the scorer at sensitive files (it opens arbitrary code files passed as an argument); (3) be cautious about automatically executing any code produced by the skill; generated code should be reviewed and tested in a safe environment; (4) because the skill's source is from an unknown publisher, prefer manual inspection before granting it broader automated access or running it on production data.

Like a lobster shell, security has layers — review code before you run it.

latestvk970p18tvkm52r334aspeehq7n838p0k

152 downloads

0 stars

1 version

Updated 1mo ago

v1.0.0

MIT-0

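Dynamic Jury Multi-Dimensional Scoring System

An intelligent review framework based on the AutoResearch approach.

Core Idea

Task analysis → Generate jury → Extreme challenge → User selection → Final jury → Iterative optimization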

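Workflow

Phase 1: Task Analysis

Analyze the user's task and identify its key dimensions:

task = "Create a high-concurrency C++ HTTP server"

analysis = {
    "type": "network service",
    "keywords": ["high concurrency", "HTTP", "server", "C++"],
    "risk_areas": ["concurrency safety", "memory management", "network protocol"],
    "quality_focus": ["performance", "security", "stability"]
}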
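
The skill leaves this analysis to the agent, so no algorithm is given for it. As a rough illustration only, the sketch below picks a task type by counting keyword hits. `TYPE_HINTS`, `TaskAnalysis`, and `analyze_task` are hypothetical names, and the hint lists are invented for the example.

```python
from dataclasses import dataclass, field

# Hypothetical keyword hints per task type (not part of the skill).
TYPE_HINTS = {
    "network service": ["http", "server", "socket", "concurrency"],
    "data processing": ["data", "cleaning", "csv", "etl"],
    "ui/frontend": ["frontend", "css", "page"],
}


@dataclass
class TaskAnalysis:
    type: str
    keywords: list[str] = field(default_factory=list)


def analyze_task(task: str) -> TaskAnalysis:
    """Pick the task type whose hint words appear most often in the task."""
    text = task.lower()
    best_type, best_hits = "general code", []
    for task_type, hints in TYPE_HINTS.items():
        hits = [hint for hint in hints if hint in text]
        if len(hits) > len(best_hits):
            best_type, best_hits = task_type, hits
    return TaskAnalysis(type=best_type, keywords=best_hits)


analysis = analyze_task("Create a high-concurrency C++ HTTP server")
print(analysis.type, analysis.keywords)
```

Plain substring matching is crude, and a task with no hits falls back to "general code". An agent-driven analysis would also fill in risk areas and quality focus, which this sketch omits.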

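Phase 2: Generate the Core Jury

Based on the task type, generate a core jury that surrounds the code from above, below, left, and right:

                    [Top] Architecture Officer
                                 ↓
[Left] Security Officer ←─── Core code ───→ [Right] Performance Officer
                                 ↑
                    [Bottom] Testing Officer

Core jury generation rules:

Task type	Core jury	Notes
Network service	Architecture Officer, Security Officer, Performance Officer, Testing Officer	Surrounded on four sides
Data processing	Data Officer, Performance Officer, Security Officer, Documentation Officer	Data-centric
UI/frontend	Aesthetics Officer, Experience Officer, Performance Officer, Compatibility Officer	User-centric
Algorithm/AI	Algorithm Officer, Performance Officer, Testing Officer, Ethics Officer	Quality first
Security tools	Security Officer, Penetration Officer, Compliance Officer, Audit Officer	Security above all
General code	Aesthetics Officer, Performance Officer, Security Officer, Testing Officer, Documentation Officer	All five officers present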
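
The rules table above is a lookup from task type to reviewers, which can be sketched as a dictionary. The role names are English renderings of the table entries. `CORE_JURIES` and `core_jury_for` are hypothetical names, and falling back to the general-code jury for unknown types is an assumption.

```python
# Task type -> core jury, taken from the rules table (names shortened).
CORE_JURIES = {
    "network service": ["Architecture", "Security", "Performance", "Testing"],
    "data processing": ["Data", "Performance", "Security", "Documentation"],
    "ui/frontend": ["Aesthetics", "Experience", "Performance", "Compatibility"],
    "algorithm/ai": ["Algorithm", "Performance", "Testing", "Ethics"],
    "security tools": ["Security", "Penetration", "Compliance", "Audit"],
    "general code": ["Aesthetics", "Performance", "Security", "Testing", "Documentation"],
}


def core_jury_for(task_type: str) -> list[str]:
    """Return the core jury for a task type.

    Assumption: unknown types fall back to the general-code jury.
    """
    return list(CORE_JURIES.get(task_type.lower(), CORE_JURIES["general code"]))


print(core_jury_for("Network service"))
```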

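Phase 3: Extreme Jury Challenge

Generate "extreme reviewers" who probe the core jury's blind spots:

Extreme reviewer types:

Extreme reviewer	Responsibility	Challenge question
🔥 Arson Officer	Destructive testing	"What happens if malicious input is passed in deliberately?"
🧟 Zombie Officer	Boundary extremes	"What if only 1 KB of memory is left?"
⏰ Time Officer	Time pressure	"What if it must finish within 10 ms?"
💀 Reaper Officer	Failure scenarios	"What if this function crashes?"
🎭 Trickster Officer	Deceptive input	"What if the user lies about the input type?"
🌀 Chaos Officer	Random faults	"What if the network suddenly drops?"
📉 Miser Officer	Resource limits	"What if CPU usage must stay below 1%?"
🌪️ Storm Officer	Heavy load	"What if there are 1 million concurrent requests?"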

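Phase 4: User Selection

Present the extreme jury to the user, who chooses which reviewers to add:

## 🎭 Proposed Extreme Reviewers

Based on the characteristics of your task, consider the following extreme reviewers:

| Reviewer | Challenge dimension | Why recommended |
|--------|----------|----------|
| 🔥 Arson Officer | Destructive testing | A network service must withstand malicious input |
| 🌀 Chaos Officer | Fault handling | Networks are unstable under high concurrency |
| 🌪️ Storm Officer | Extreme load | High concurrency needs load-test validation |

**Choose which extreme reviewers to add:**
- [ ] Add all
- [ ] Add selected (specify which)
- [ ] Add none, use the core jury only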

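Phase 5: Final Jury

Combine the core and extreme reviewers into the final jury for this task:

## ⚔️ Final Jury Lineup

### Core formation

                [Architecture Officer] 赵构
                              ↓
[Security Officer] 盾山 ─── Code ─── [Performance Officer] 闪电
                              ↑
                [Testing Officer] 试金石

### Extreme challengers

🔥 Arson Officer·焚天 | 🌀 Chaos Officer·乱舞 | 🌪️ Storm Officer·狂啸

7 reviewers in total; composite weights are assigned automatically.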
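
Assembling the final jury can be sketched as below. How weights are assigned automatically is not specified in the source, so the equal split here is only a placeholder assumption, and `assemble_final_jury` is a hypothetical name. The cap of 3 extreme reviewers follows the `extreme_jury_max` default.

```python
def assemble_final_jury(core, extreme, extreme_jury_max=3):
    """Combine core and extreme reviewers and give each a weight in percent.

    Assumption: an equal split stands in for the unspecified
    automatic weight assignment.
    """
    if len(extreme) > extreme_jury_max:
        raise ValueError(f"at most {extreme_jury_max} extreme reviewers allowed")
    members = list(core) + list(extreme)
    if not members:
        raise ValueError("the jury needs at least one reviewer")
    share = 100 / len(members)
    return {name: share for name in members}


jury = assemble_final_jury(
    ["Architecture", "Security", "Performance", "Testing"],
    ["Arson", "Chaos", "Storm"],
)
print(len(jury))  # 7 reviewers, as in the lineup above
```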

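Phase 6: Multi-Round Iteration

def run_review_loop(task, core_jury, extreme_jury,
                    max_iterations=5, threshold=80, min_improvement=5):
    """Iterate until the code is accepted, stagnates, or the budget runs out.

    generate_or_improve, weighted_average and generate_feedback are helpers
    assumed to be defined elsewhere.
    """
    previous_feedback = None              # nothing to improve on in round 1
    best_score, best_code = None, None

    for iteration in range(max_iterations):
        # 1. Generate or improve the code
        code = generate_or_improve(task, previous_feedback)

        # 2. Core jury scoring
        core_scores = core_jury.evaluate(code)

        # 3. Extreme reviewer challenges
        extreme_challenges = extreme_jury.challenge(code)

        # 4. Composite score
        total = weighted_average(core_scores, extreme_challenges)

        # 5. Decision
        if total >= threshold:
            return "ACCEPT", code
        # Stagnant: gained less than min_improvement over the best round so far
        no_improvement = best_score is not None and total - best_score < min_improvement
        if best_score is None or total > best_score:
            best_score, best_code = total, code
        if no_improvement:
            return "STAGNANT", best_code
        previous_feedback = generate_feedback(core_scores, extreme_challenges)

    # Assumption: the original does not say what happens when iterations run out
    return "MAX_ITERATIONS", best_code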
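
The loop above relies on a composite score whose formula is not given. A minimal sketch of the core part is a weighted mean of the core scores. `core_weighted_mean` is a hypothetical helper, and how extreme challenge results feed into the total is left out because the source does not specify it.

```python
def core_weighted_mean(core_scores, weights=None):
    """Weighted mean of the core reviewers' scores.

    Assumptions: scores are on a 0-100 scale, and reviewers are
    weighted equally when no weights are given.
    """
    if not core_scores:
        raise ValueError("no core scores to average")
    if weights is None:
        weights = {name: 1.0 for name in core_scores}
    total_weight = sum(weights[name] for name in core_scores)
    weighted_sum = sum(score * weights[name] for name, score in core_scores.items())
    return weighted_sum / total_weight


scores = {"Architecture": 82, "Performance": 75, "Security": 68, "Testing": 60}
print(core_weighted_mean(scores))
```

With equal weights these four scores average 71.25. Doubling the Security weight pulls the result down to 70.6, which shows how a heavier weight on a weak dimension lowers the total.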

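Reviewer Role Library

Core Reviewers

Reviewer	Symbol	Dimension	Weight range
🎨 Aesthetics Officer	🎨	Code aesthetics	10-25%
⚡ Performance Officer	⚡	Execution efficiency	10-25%
🔒 Security Officer	🔒	Security	10-25%
🧪 Testing Officer	🧪	Test quality	10-25%
📝 Documentation Officer	📝	Documentation completeness	10-25%
🏗️ Architecture Officer	🏗️	Architecture design	10-20%
📊 Data Officer	📊	Data handling	10-20%
👁️ Experience Officer	👁️	User experience	10-20%
⚖️ Compliance Officer	⚖️	Compliance	10-20%
🤖 Algorithm Officer	🤖	Algorithm quality	10-20%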
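
The weight ranges in this table can be used to sanity-check a proposed weight assignment. The sketch below only checks each weight against its range. `WEIGHT_RANGES` and `out_of_range` are hypothetical names, and the role names are English renderings of the table entries.

```python
# Allowed weight range in percent per core reviewer, from the table above.
WEIGHT_RANGES = {
    "Aesthetics": (10, 25), "Performance": (10, 25), "Security": (10, 25),
    "Testing": (10, 25), "Documentation": (10, 25),
    "Architecture": (10, 20), "Data": (10, 20), "Experience": (10, 20),
    "Compliance": (10, 20), "Algorithm": (10, 20),
}


def out_of_range(weights):
    """Return the roles whose weight (in percent) falls outside its range."""
    bad = []
    for role, weight in weights.items():
        low, high = WEIGHT_RANGES[role]
        if not low <= weight <= high:
            bad.append(role)
    return bad


print(out_of_range({"Architecture": 25, "Security": 25}))
```

Here Architecture is flagged because its range tops out at 20%, while Security at 25% is still allowed.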

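Extreme Reviewers

Reviewer	Symbol	Challenge type	Applicable scenarios
🔥 Arson Officer	🔥	Destructive testing	Network, security, input handling
🧟 Zombie Officer	🧟	Resource limits	Embedded, mobile
⏰ Time Officer	⏰	Time pressure	Real-time systems, high-frequency trading
💀 Reaper Officer	💀	Failure recovery	Critical systems, finance
🎭 Trickster Officer	🎭	Input deception	User input, APIs
🌀 Chaos Officer	🌀	Random faults	Distributed systems, networks
📉 Miser Officer	📉	Resource limits	Performance-sensitive code
🌪️ Storm Officer	🌪️	Extreme load	High concurrency, games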
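
One way to turn the scenario column into proposals is to match the task's tags against each reviewer's scenarios and keep the best matches. This is a sketch under that assumption, not the skill's actual selection logic. `EXTREME_SCENARIOS` and `propose_extreme_reviewers` are hypothetical names, and the cap follows the `extreme_jury_max` default of 3.

```python
# Scenario tags per extreme reviewer, rendered in English from the table above.
EXTREME_SCENARIOS = {
    "Arson": {"network", "security", "input handling"},
    "Zombie": {"embedded", "mobile"},
    "Time": {"real-time systems", "high-frequency trading"},
    "Reaper": {"critical systems", "finance"},
    "Trickster": {"user input", "api"},
    "Chaos": {"distributed", "network"},
    "Miser": {"performance-sensitive"},
    "Storm": {"high concurrency", "games"},
}


def propose_extreme_reviewers(task_tags, extreme_jury_max=3):
    """Rank reviewers by how many of their scenarios match the task tags."""
    tags = {tag.lower() for tag in task_tags}
    matches = [
        (len(scenarios & tags), name)
        for name, scenarios in EXTREME_SCENARIOS.items()
        if scenarios & tags
    ]
    # list.sort is stable, so ties keep the table order.
    matches.sort(key=lambda pair: -pair[0])
    return [name for _, name in matches[:extreme_jury_max]]


print(propose_extreme_reviewers(["network", "high concurrency"]))
```

For a high-concurrency network task this proposes the Arson, Chaos, and Storm reviewers, the same three that the server example in the usage section proposes.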

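Configuration Parameters

Parameter	Default	Description
`max_iterations`	5	Maximum number of iterations
`accept_threshold`	80	Score at or above which the code is accepted
`min_improvement`	5	Minimum score improvement
`core_jury_size`	4-5	Number of core jury members
`extreme_jury_max`	3	Maximum number of extreme reviewers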
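
These parameters could be carried in a small config object, sketched here as a dataclass. `JuryConfig` is a hypothetical name, and the validation rules (including the 0-100 score scale) are assumptions inferred from the defaults rather than stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JuryConfig:
    max_iterations: int = 5
    accept_threshold: float = 80
    min_improvement: float = 5
    core_jury_size: tuple[int, int] = (4, 5)  # (min, max) core members
    extreme_jury_max: int = 3

    def __post_init__(self):
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")
        # Assumption: scores are on a 0-100 scale.
        if not 0 <= self.accept_threshold <= 100:
            raise ValueError("accept_threshold must be between 0 and 100")
        if self.extreme_jury_max < 0:
            raise ValueError("extreme_jury_max must not be negative")


config = JuryConfig(max_iterations=3)
print(config)
```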

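Usage Examples

Example 1: High-concurrency server

User: Create a high-concurrency C++ HTTP server

System analysis:
- Type: network service
- Keywords: high concurrency, HTTP, server
- Risk areas: concurrency safety, memory leaks, connection management

Generated core jury:
              [Architecture Officer]
                         ↓
[Security Officer] ─── Code ─── [Performance Officer]
                         ↑
              [Testing Officer]

Proposed extreme reviewers:
- 🔥 Arson Officer (malicious requests)
- 🌪️ Storm Officer (extreme concurrency)
- 🌀 Chaos Officer (network faults)

User choice: add all

Final jury: 7 reviewers
Starting multi-round iteration...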

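Example 2: Data processing script

User: Write a Python data cleaning script

System analysis:
- Type: data processing
- Keywords: data, cleaning, script

Generated core jury:
              [Data Officer]
                         ↓
[Security Officer] ─── Code ─── [Performance Officer]
                         ↑
              [Documentation Officer]

Proposed extreme reviewers:
- 🎭 Trickster Officer (dirty data)
- 💀 Reaper Officer (data loss)

User choice: add the Trickster Officer

Final jury: 5 reviewers
Starting multi-round iteration...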

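Feedback Output Format

## ⚔️ Review Round N

### Core scores
| Reviewer | Score | Status | Main finding |
|--------|------|------|----------|
| 🏗️ Architecture Officer | 82 | ✅ | Clear module boundaries |
| ⚡ Performance Officer | 75 | ⚠️ | Connection pool can be optimized |
| 🔒 Security Officer | 68 | ⚠️ | Missing input validation |
| 🧪 Testing Officer | 60 | ⚠️ | Insufficient test coverage |

### Extreme challenges
| Reviewer | Passed | Challenge result |
|--------|------|----------|
| 🔥 Arson Officer | ❌ | Malicious request causes a crash |
| 🌪️ Storm Officer | ⚠️ | Latency rises at 10K concurrency |

**Composite score: 71.2**
**Status: continue iterating**

### Suggested improvements
1. [Security] Add request header validation
2. [Testing] Add concurrency test cases
3. [Extreme] Add request rate limiting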
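
A report in this shape can be produced with plain string formatting. The sketch below renders only the core score table. `render_core_scores` is a hypothetical helper, and marking a score ✅ when it reaches the acceptance threshold of 80 is an assumption that happens to match the sample rows.

```python
def render_core_scores(round_no, scores, threshold=80):
    """Render the core score table of one review round as markdown.

    Assumption: a reviewer's status is ✅ at or above the threshold, else ⚠️.
    """
    lines = [
        f"## ⚔️ Review Round {round_no}",
        "",
        "### Core scores",
        "| Reviewer | Score | Status |",
        "|--------|------|------|",
    ]
    for name, score in scores.items():
        status = "✅" if score >= threshold else "⚠️"
        lines.append(f"| {name} | {score} | {status} |")
    return "\n".join(lines)


print(render_core_scores(1, {"Architecture": 82, "Security": 68}))
```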

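Notes

Keep the number of extreme reviewers moderate to avoid over-penalizing the code
Give every iteration a clear improvement goal
Stop promptly when iteration stagnates
Record the review history for later analysis and tuning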
