Multi-Model Critique

v1.0.1

Use multiple models in a 4-step cycle of drafting, cross-critique, revision, and synthesis to generate higher-quality answers for complex, high-stakes queries.

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description (multi-model critique) align with the provided artifacts: orchestration templates, prompt templates, an output schema, and two helper scripts. The skill requires resolving ACP agentIds and coordinating multiple models, which matches the stated goal.
Instruction Scope
SKILL.md restricts use to complex=true and describes a bounded 4-round workflow using sessions_spawn/sessions_send/sessions_history for model orchestration. It does not instruct reading unrelated system files or harvesting environment variables. The only potential surprise is a default language preference (Korean) in the templates, but that is a functional choice rather than scope creep.
Install Mechanism
No install spec is provided (instruction-only plus two small local helper scripts). No downloads, package installs, or archive extraction are present. Scripts are local utilities that read/write prompt and plan files.
Credentials
The skill declares no required environment variables, credentials, or config paths. The orchestration relies on platform ACP agentIds (expected for cross-model runs) and does not request unrelated secrets.
Persistence & Privilege
The always flag is false, and the skill does not request permanent system presence or attempt to modify other skills. Helper scripts write outputs to user-specified directories only, and run_orchestration explicitly states it does not call OpenClaw tools directly.
Assessment
This skill appears to do what it says: orchestrate multiple ACP models through draft → critique → revision → synthesis. Before using it, confirm that you trust the ACP agentIds you will include (the workflow sends user content to all listed models), and ensure the runtime will not point output files at sensitive system locations. Note that the templates default to Korean unless overridden. The package requests no credentials and performs no external downloads. If you run the helper scripts locally, inspect the output-directory arguments you provide, and grant the platform's sessions_spawn/sessions_send privileges only to models you trust.


v1.0.1 · MIT-0 · 2 versions · 358 downloads · 0 stars · Updated 1 month ago

Multi-Model Critique

Overview

Use this skill only for complex tasks. Route multiple models through the same 4-step loop (Plan -> Execute -> Review -> Improve), then run cross-critique and synthesis to produce a higher-quality final answer than any single-model draft.

Trigger rule

Enable this skill only when the request explicitly sets complex to true (or equivalent wording such as “this is complex/deep”).

If complex is false, skip this skill and respond with normal single-model behavior.

Inputs

Collect or confirm these inputs before execution:

  • complex: boolean flag (must be true)
  • question: user request
  • models: list of ACP agentId values (typically 3)
  • constraints: output format, language, length, deadlines, forbidden assumptions
  • ops: optional runtime controls (timeoutSec, maxRetries, maxRounds, budgetUsd)
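Checking these inputs before spawning any sessions avoids wasted rounds. A minimal validator sketch; the field names follow the list above, while the error messages are illustrative, not part of the skill's spec:

```python
def validate_inputs(req):
    """Return a list of problems; an empty list means the request is runnable.

    Field names mirror the Inputs list: complex, question, models, ops.
    """
    problems = []
    if req.get("complex") is not True:
        problems.append("complex must be true; otherwise skip this skill")
    if not req.get("question"):
        problems.append("question is required")
    if not req.get("models"):
        problems.append("models must list at least one ACP agentId")
    return problems
```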

File map (what each file does)

  • SKILL.md (this file): orchestration policy, trigger conditions, and execution sequence.
  • references/prompt-templates.md: reusable prompts for draft, critique, revision, and final synthesis (includes scoring rubric usage).
  • references/orchestration-template.md: practical OpenClaw orchestration flow using sessions_spawn, sessions_send, and sessions_history.
  • references/output-schema.md: machine-parseable JSON output schema for final result and per-model scoring.
  • scripts/build_round_prompts.py: utility to generate per-model prompt files for repeated runs.
  • scripts/run_orchestration.py: local helper that builds a run plan JSON (model mapping, round prompts, runtime settings).

Workflow

Step 1) Parallel draft round

Spawn one ACP session per model with the same task and constraints.

Per-model requirements:

  • Follow the exact internal sequence: Plan -> Execute -> Review -> Improve
  • Print all four sections explicitly
  • End with Draft Answer

Use sessions_spawn with runtime:"acp" and explicit agentId.
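The prompt sent to each spawned session can be assembled from the question and constraints. A sketch only: the wording here is a hypothetical stand-in for the canonical templates in references/prompt-templates.md.

```python
def build_draft_prompt(question, constraints):
    """Assemble a Step-1 draft prompt enforcing the four-section sequence.

    Placeholder wording; real prompts come from references/prompt-templates.md.
    """
    return (
        "Work through the task in this exact internal sequence, printing all\n"
        "four sections: Plan -> Execute -> Review -> Improve.\n"
        "End with a section titled 'Draft Answer'.\n\n"
        f"Task:\n{question}\n\n"
        f"Constraints:\n{constraints}"
    )
```

The same prompt text goes to every model so that drafts differ only by model behavior, not by instruction wording.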

Step 2) Cross-critique round

Share peer Draft Answer outputs with each model and require structured critique:

  • Strengths
  • Weaknesses
  • Missing assumptions/data
  • Hallucination and confidence risks
  • Concrete fix suggestions

Also require ranking of peer drafts with rationale.
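Distributing peer drafts while withholding each model's own output can be sketched as below; the section headings match the list above, but the surrounding wording is illustrative:

```python
# Critique headings from the cross-critique round above.
CRITIQUE_SECTIONS = [
    "Strengths",
    "Weaknesses",
    "Missing assumptions/data",
    "Hallucination and confidence risks",
    "Concrete fix suggestions",
]

def build_critique_prompt(own_model, drafts_by_model):
    """Send every peer draft (never the model's own) and request the
    structured critique plus a ranking with rationale."""
    peers = "\n\n".join(
        f"--- Draft from {model} ---\n{draft}"
        for model, draft in drafts_by_model.items()
        if model != own_model
    )
    return (
        "Critique each peer draft under these headings: "
        + ", ".join(CRITIQUE_SECTIONS)
        + ".\nThen rank the peer drafts and explain your rationale.\n\n"
        + peers
    )
```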

Step 3) Revision round

Send critique feedback back to each original model and request revision:

  • Keep Plan -> Execute -> Review -> Improve
  • Include Changes from Critique
  • End with Revised Answer

Step 4) Final synthesis round

Integrate revised answers into one user-facing output:

  • Best final answer
  • Why the synthesis is stronger than individual drafts
  • Remaining uncertainties
  • Optional next actions

Scoring rubric (required in critique + synthesis)

Score each draft on a 1-5 scale:

  • accuracy: factual correctness and internal consistency
  • coverage: completeness against user request and constraints
  • evidence: quality of assumptions and support
  • actionability: usefulness for concrete decision/action

Default weighted score: 0.40 * accuracy + 0.25 * coverage + 0.20 * evidence + 0.15 * actionability

Use this score to justify rankings and the final selected direction.
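The default weighting can be computed with a small helper; the weights are taken verbatim from the formula above:

```python
# Default weights from the rubric: accuracy 0.40, coverage 0.25,
# evidence 0.20, actionability 0.15.
WEIGHTS = {"accuracy": 0.40, "coverage": 0.25, "evidence": 0.20, "actionability": 0.15}

def weighted_score(scores):
    """Combine 1-5 rubric scores into the default weighted score."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing rubric dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be on the 1-5 scale")
    return sum(w * scores[name] for name, w in WEIGHTS.items())
```

A draft scoring 5 on every dimension yields 5.0; the accuracy weight dominates, so a factually weak draft cannot rank first on polish alone.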

Prompting resources

  • Use references/prompt-templates.md for canonical prompts.
  • Use scripts/build_round_prompts.py when you need file-based prompt generation for repeated or batched runs.
  • Use scripts/run_orchestration.py to generate a deterministic run-plan artifact for reproducible execution.
  • Use references/orchestration-template.md for concrete OpenClaw tool-call flow.

Required user-facing output shape

  1. Final Answer
  2. Key Improvements from Critique
  3. Uncertainties
  4. Next Steps (optional)

When machine consumption is needed, return JSON matching references/output-schema.md.

Do not expose private chain-of-thought. Provide concise reasoning summaries only.
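The four-section shape can be rendered mechanically. A sketch under assumptions: the plain-text section labels are illustrative, and the machine-readable form is defined separately by references/output-schema.md.

```python
def format_final_output(final_answer, improvements, uncertainties, next_steps=None):
    """Render the required user-facing sections in order.

    Next Steps is emitted only when provided, matching its optional status.
    """
    parts = [
        "Final Answer", final_answer,
        "Key Improvements from Critique", *("- " + i for i in improvements),
        "Uncertainties", *("- " + u for u in uncertainties),
    ]
    if next_steps:
        parts += ["Next Steps", *("- " + s for s in next_steps)]
    return "\n".join(parts)
```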

Failure handling

  • One model fails: continue with remaining models and note reduced diversity.
  • Two or more models fail: ask whether to retry or switch to single-model mode.
  • Strong disagreement remains: present competing hypotheses and state what evidence would resolve them.
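The model-failure policy above maps directly onto an orchestrator decision; a sketch, with the return labels being illustrative:

```python
def failure_action(total_models, failed):
    """Map the failure-handling policy onto an orchestrator decision."""
    if failed == 0:
        return "continue"
    if failed == 1 and total_models - failed >= 1:
        return "continue-with-note"  # proceed, noting reduced diversity
    return "ask-user"                # retry, or switch to single-model mode
```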

Runtime defaults (recommended)

  • timeoutSec: 180 per round per model
  • maxRetries: 1 per failed model turn
  • maxRounds: fixed at 4 (draft, critique, revision, synthesis)
  • budgetUsd: optional hard stop when cost-sensitive
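Merging caller-supplied ops onto these defaults might look like the sketch below; DEFAULT_OPS mirrors the values above, and ignoring maxRounds overrides reflects the "fixed at 4" rule:

```python
# Recommended defaults from this section; budgetUsd has no default and is
# simply passed through when the caller supplies it.
DEFAULT_OPS = {"timeoutSec": 180, "maxRetries": 1, "maxRounds": 4}

def resolve_ops(user_ops=None):
    """Overlay caller ops on the defaults.

    maxRounds is fixed at 4 by the workflow, so overrides to it are ignored.
    """
    ops = dict(DEFAULT_OPS)
    if user_ops:
        ops.update({k: v for k, v in user_ops.items() if k != "maxRounds"})
    return ops
```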
