Council of LLMs

v1.0.0

Multi-model deliberation for high-stakes decisions. Don't take one model's word for it.

by Wahaj Ahmed (@wahajahmed010)
Security Scan
Capability signals
Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description (multi-model deliberation) match the files and runtime: SKILL.md, prompts, and scripts orchestrate multiple LLM providers via OpenClaw. The required tools (openclaw, sessions_spawn) are appropriate for the stated goal, and no unrelated credentials or binaries are requested.
Instruction Scope
Instructions and the script stick to the council use case: listing models, selecting models, reading an optional review file, and orchestrating debate. Important: when --review is used, the script injects file contents into the question, so local file data will be sent to the configured models. This is expected for a code-review task but is a privacy/exfiltration risk if you run it on sensitive files. Also, SKILL.md describes safeguards (cost cap, timeouts, rate limiting), but the included script is an MVP placeholder and does not enforce those limits.
Install Mechanism
No automated download/install from untrusted URLs. SKILL.md suggests installing via ClawHub or cloning from GitHub — both are standard. There is no extract-from-arbitrary-URL install spec in the manifest.
Credentials
The skill does not request environment variables or credentials directly in registry metadata. SKILL.md recommends configuring provider tokens via OpenClaw (e.g., ollama.cloud.token) which is appropriate. Users should note that providing provider tokens gives the configured models access to any data the skill sends (e.g., review file contents). install.json defines configurable env defaults (COUNCIL_TIMEOUT, COUNCIL_MAX_TOKENS) — reasonable and not excessive.
Persistence & Privilege
The always flag is false, and the skill does not request permanent system-wide changes or touch other skills' configs. It reads and writes only its own config path (~/.openclaw/council-config.json) and does not escalate privileges.
Assessment
This skill appears to do what it claims: orchestrate multiple LLMs via OpenClaw. Before installing, be aware:

1. You must have OpenClaw providers configured. Providing provider tokens (e.g., Ollama/OpenAI) means any content the skill sends (including full file contents when using --review) will go to those services; don't submit sensitive secrets unless you trust those providers.
2. The provided Bash script is an MVP/placeholder: the parallel sessions_spawn logic is not implemented, and the advertised safeguards (cost cap, enforced timeouts/rate limits) are largely descriptive, not enforced in the script.
3. There is a small bug/typo in the script (a stray 'n' before the review-file assignment) that may break --review, and jq is used but not checked for presence.

Recommended actions: review and test the script in a safe environment, ensure jq and openclaw are installed, inspect ~/.openclaw/council-config.json for defaults/presets, and avoid passing sensitive files to the skill unless your data-sharing policies with your LLM providers allow it.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97f91dmnzghhxk8ed0f1fd3vh8577aw
23 downloads · 0 stars · 1 version
Updated 5h ago
v1.0.0
MIT-0

Council of LLMs

Multi-model deliberation for high-stakes decisions. Don't take one model's word for it.

Version: 1.0.0
License: MIT
Author: Wahaj Ahmed

Overview

The Council of LLMs orchestrates structured multi-model debate — routing a single question to multiple LLMs simultaneously, collecting their answers, and surfacing agreements/disagreements. Built for decisions where being wrong costs more than the overhead of multiple perspectives.

Best for: Security audits, architecture decisions, policy analysis, LLM output evaluation
Not for: Quick lookups, casual chat, first drafts

Pre-requisites

  • OpenClaw with multiple LLM providers configured (required)

    • Verify: openclaw status shows 2+ providers
    • Examples: ollama/kimi-k2.5, openai/gpt-4o, anthropic/claude-3-opus
  • Multi-model access (recommended)

    • Local Ollama: Can run 1 model at a time (sequential mode)
    • Recommended: Ollama Cloud — parallel multi-model execution
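A quick pre-flight check can confirm the two-provider requirement before convening a council. This is a hedged sketch: it parses sample output captured into a variable, since the exact `openclaw status` format may differ; in practice you would capture real output with `status_output=$(openclaw status)`.

```shell
# Count distinct providers from (sample) `openclaw status` output.
status_output="ollama/kimi-k2.5
openai/gpt-4o
openai/gpt-4o-mini"

# Provider is the part before the first slash; de-duplicate and count.
provider_count=$(printf '%s\n' "$status_output" | cut -d/ -f1 | sort -u | wc -l | tr -d ' ')

if [ "$provider_count" -lt 2 ]; then
  echo "Need 2+ providers configured for a council" >&2
  exit 1
fi
echo "providers: $provider_count"
```

With the sample output above, two distinct providers (ollama and openai) are found, so the check passes.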

Installation

Via ClawHub (Recommended)

clawhub install wahajahmed010/council-of-llms

Manual

cd ~/.openclaw/skills
git clone https://github.com/wahajahmed010/council-of-llms.git

Usage

Quick Start (Zero Config)

# Run with built-in sample question
council

# Run with your own question
council "Should we use JWT or session cookies for auth?"

# Security audit example
council --review "Analyze this Python function for security issues" --input ./auth.py

Model Selection

# List available models
council --list-models

# Interactive model selection
council "Architecture decision" --select-models

# Explicit model list
council "Security audit" --models "ollama/kimi-k2.5,openai/gpt-4o,anthropic/claude-3-opus"

# Use specific council preset
council "Code review" --preset security

Configuration

# Sequential mode (for limited hardware)
council "Question" --sequential

# Extended timeout for complex analysis
council "Question" --timeout 180

# Export results
council "Question" --output report.md

How It Works

Architecture

User Question
      ↓
[Pre-flight Check] → Verify 2+ models available
      ↓
[Agent Spawning] → Spawn 2-3 agents with different models
      ↓
[Round 1: Opening] → Each agent provides initial analysis
      ↓
[Round 2: Rebuttal] → Agents respond to each other's points
      ↓
[Synthesis] → Compare positions, find agreements/disagreements
      ↓
[Report] → Structured output with verdict
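The two debate rounds above can be sketched as a simple loop. This is a minimal sequential sketch; `run_model` is a hypothetical helper stubbed with `echo` here, whereas the real skill would invoke a model per agent via OpenClaw.

```shell
# Stub standing in for a real model invocation (real impl calls openclaw).
run_model() { echo "[$1] position on: $2"; }

models="modelA modelB"
question="Should we use JWT or session cookies?"

# Round 1: opening statements from each model.
openings=""
for m in $models; do
  openings="$openings$(run_model "$m" "$question")
"
done

# Round 2: each model rebuts the combined openings.
rebuttals=""
for m in $models; do
  rebuttals="$rebuttals$(run_model "$m" "rebut the openings above")
"
done

printf 'Openings:\n%s' "$openings"
```

The synthesis step would then compare `$openings` and `$rebuttals` to surface agreements and disagreements.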

Fallback Mode

If sessions_spawn is unavailable, the skill automatically switches to single-prompt multi-persona simulation — all "agents" represented as sections in one prompt. Slightly less authentic but works everywhere.
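The fallback can be as simple as assembling one combined prompt, with each persona as a section. A hedged sketch (persona names mirror the report format; the real prompt template may differ):

```shell
# Fold all personas into a single prompt when sessions_spawn is unavailable.
question="Should we use JWT or session cookies for auth?"

prompt=$(cat <<EOF
Simulate a council of three experts debating: $question
### Strategist
State the strategist's opening position.
### Security Expert
State the security expert's opening position.
### Pragmatist
State the pragmatist's opening position.
Finish with a synthesis of agreements and disagreements.
EOF
)

printf '%s\n' "$prompt"
```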

Output Format

# Council Report: [Question]

## Participants
- Strategist (ollama/kimi-k2.5)
- Security Expert (openai/gpt-4o)
- Pragmatist (anthropic/claude-3-opus)

## Individual Positions

### Strategist
**Stance:** JWT with short expiry
**Key Points:**
- Stateless authentication scales horizontally
- Reduces database lookups
- Industry standard for microservices

### Security Expert
**Stance:** Session cookies with httpOnly
**Key Points:**
- XSS protection via httpOnly flag
- Easier revocation on compromise
- No token storage complexity

### Pragmatist
**Stance:** Hybrid approach
**Key Points:**
- Sessions for web, JWT for API
- Best of both worlds
- Implementation overhead worth it

## Agreement Matrix

| Point | Strategist | Security | Pragmatist |
|-------|------------|----------|------------|
| Stateless scaling | ✅ | ⚠️ | ✅ |
| XSS protection | ⚠️ | ✅ | ✅ |
| Revocation ease | ⚠️ | ✅ | ✅ |
| Implementation | ✅ | ✅ | ⚠️ |

## Key Disagreements

1. **Security vs Scalability**: Security Expert prioritizes safety over performance
2. **Complexity**: Strategist sees JWT as simpler; Security Expert sees sessions as simpler

## Synthesis

**Consensus:** Hybrid approach recommended for most teams
**Dissent:** Security Expert maintains pure sessions for high-security contexts
**Confidence:** Medium (genuine disagreement on trade-offs)

## Recommendation

Start with session cookies. Migrate to JWT only if:
- Horizontal scaling becomes bottleneck
- Stateless requirement is critical
- Team has JWT expertise

---
*Generated by Council of LLMs v1.0.0*
*Models: kimi-k2.5, gpt-4o, claude-3-opus*
*Time: 45s | Tokens: 12,847*

Safeguards

The skill includes automatic protections:

| Safeguard | Default | Description |
|-----------|---------|-------------|
| Timeout per model | 120s | Kills slow models, proceeds with others |
| Cost cap | 50K tokens | Hard stop if projection exceeds limit |
| Max rounds | 2 | Prevents infinite deliberation |
| Model diversity | Required | Rejects if all models same provider |
| Rate limiting | 10/min | Prevents accidental spam |
| Partial failure | Continue | Works even if 1 model fails |
| Context budget | 70% window | Fails fast before overflow |
| User opt-in | Required | Shows cost estimate before run |
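Two of these safeguards, the per-model timeout and the cost cap, take only a few lines of shell to enforce. This is an illustrative sketch using a crude 4-characters-per-token estimate and a stubbed model call, not the skill's actual implementation:

```shell
COUNCIL_TIMEOUT=${COUNCIL_TIMEOUT:-120}          # seconds per model
COUNCIL_MAX_TOKENS=${COUNCIL_MAX_TOKENS:-50000}  # hard cost cap

# Rough projection: ~4 chars per token, 3 models, 2 rounds.
prompt="Should we use JWT or session cookies for auth?"
projected=$(( (${#prompt} / 4 + 1) * 3 * 2 ))

if [ "$projected" -gt "$COUNCIL_MAX_TOKENS" ]; then
  echo "Cost cap exceeded: projected $projected tokens" >&2
  exit 1
fi

# Per-model timeout via coreutils `timeout`; model call stubbed with echo.
timeout "$COUNCIL_TIMEOUT" sh -c 'echo "model answered in time"'
```

The real projection would also have to account for response tokens and the growing context across rounds, which a simple prompt-length heuristic underestimates.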

Configuration

~/.openclaw/council-config.json:

{
  "default_models": [
    "ollama/kimi-k2.5",
    "openai/gpt-4o",
    "anthropic/claude-3-opus"
  ],
  "timeout": 120,
  "max_tokens_per_model": 8192,
  "cost_warning_threshold": 25000,
  "sequential_fallback": true,
  "output_format": "markdown",
  "presets": {
    "security": {
      "models": ["openai/gpt-4o", "anthropic/claude-3-opus"],
      "system_prompt": "security-expert"
    },
    "architecture": {
      "models": ["ollama/kimi-k2.5", "anthropic/claude-3-opus"],
      "system_prompt": "systems-architect"
    }
  }
}
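Since the script reads this config with jq without checking that jq is installed, a small guard avoids confusing failures. A sketch (it writes a trimmed copy of the config above to a temp file purely for illustration):

```shell
# Fail early with a clear message if jq is missing.
command -v jq >/dev/null 2>&1 || { echo "jq is required but not installed" >&2; exit 1; }

# Trimmed stand-in for ~/.openclaw/council-config.json.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{"presets":{"security":{"models":["openai/gpt-4o","anthropic/claude-3-opus"]}}}
EOF

# Pull the model list for the "security" preset.
models=$(jq -r '.presets.security.models[]' "$cfg")
printf '%s\n' "$models"
rm -f "$cfg"
```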

Limitations

  • Speed: 2-3x slower than a single model (parallel execution narrows the gap)
  • Cost: Multiplies with the number of models
  • Not for: Simple facts, casual chat, first drafts
  • Diversity is limited: Most models share training-data biases

When NOT to Use

  • Simple factual queries (weather, definitions)
  • Real-time applications (chat, support bots) where latency matters
  • Cost-sensitive products with limited API budgets
  • Tasks requiring authoritative, consistent answers (legal, medical — a council of conflicting advice is dangerous)

License

MIT © 2026 Wahaj Ahmed
