control-mirror

v1.1.0

Audits agent and system architecture through a control theory lens for stability, feedback, noise, delay, oscillation, error control, adaptive behavior, and...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for xb19960921/control-mirror.

Prompt preview (Install & Setup):
Install the skill "control-mirror" (xb19960921/control-mirror) from ClawHub.
Skill page: https://clawhub.ai/xb19960921/control-mirror
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install control-mirror

ClawHub CLI


npx clawhub@latest install control-mirror
Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the SKILL.md: the skill is explicitly an engineering-cybernetics style architecture review. It declares no binaries, env vars, or installs — all proportional to an instruction-only review skill.
Instruction Scope
Runtime instructions stay within the stated domain (mapping controller, sensors, feedback, delays, etc.) and do not instruct the agent to read system files, exfiltrate data, or call external endpoints. The SKILL.md asks the reviewer to examine the included reference files (which are bundled with the skill) — that is expected for an instruction-only review skill.
Install Mechanism
No install spec and no code files are present, so nothing is downloaded or written to disk by the skill itself (lowest-install risk).
Credentials
The skill requires no environment variables, credentials, or config paths. There are no unexplained secrets or unrelated service credentials requested.
Persistence & Privilege
Flags are normal: always=false and user-invocable=true. disable-model-invocation=false (agent-autonomy allowed) is the platform default; this is not problematic here because the skill has no privileged requirements. If autonomous invocation were combined with other red flags it would deserve closer scrutiny.
Assessment
This skill appears coherent and low-risk because it is instruction-only and asks for no installs or credentials. Before using it: avoid pasting sensitive secrets, credentials, or complete raw logs into the prompt (redact or summarize architecture elements instead); prefer to provide high-level diagrams, metrics, and anonymized examples. Note the skill's source is unknown — if you plan to run reviews at scale or let agents invoke the skill autonomously, consider validating the author or maintaining a vetted internal copy of the guidance. If you want greater assurance, request provenance (author/contact) or run the prompts on non-sensitive example systems first.

Like a lobster shell, security has layers — review code before you run it.

Latest: vk974cehcvv08efk8r0p6wmptfs85g2wk
133 downloads · 1 star · 3 versions
Updated 3d ago · v1.1.0 · MIT-0

Control Mirror

Control Mirror turns the core ideas of Qian Xuesen's Engineering Cybernetics into a practical architecture review skill.

It is not a generic architecture review checklist. It treats a system as a controlled dynamic system and asks one brutal question:

Is this system becoming more controllable, stable, observable, and adaptive — or merely more complicated?

Use it as three tools at once:

  • Mirror: reveal hidden instability, noise, delay, positive feedback, and uncontrolled loops.
  • Ruler: measure whether the architecture has real controllability, not just many modules.
  • Compass: identify the next highest-leverage evolution step toward a self-stabilizing system.

When to Use

Use this skill when the user wants to review, diagnose, or evolve any system architecture, especially:

  • software platforms, agent systems, workflow engines, automation systems, data pipelines, product operating systems
  • multi-agent systems, AgentOS, LLM tool-use systems, memory systems, routing layers, evaluation pipelines
  • systems that repeatedly fail, oscillate, drift, over-consume resources, or require constant human supervision
  • architectures with unclear feedback loops, weak observability, noisy memory/context, delayed correction, or unsafe automation
  • reviews involving: control theory, feedback, stability, damping, adaptation, error correction, delay, noise, self-stabilization, cybernetics

Do not use this as a generic “is the code clean?” review. Use it when the key question is system behavior over time.


Core Lens

Do not start by asking “how many features does the system have?”

Start with:

  1. What is the reference input / target state?
  2. What acts as the controller?
  3. What is the controlled object / environment?
  4. What sensors observe the system state?
  5. What feedback changes future behavior?
  6. Where are delay, noise, saturation, and error accumulation introduced?
  7. What prevents unstable positive feedback?
  8. Can the system converge after disturbance?

If you cannot map these, the review is still too superficial.
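
One way to make this mapping concrete before writing any prose is to capture the eight answers in a small structure and see which fields stay empty. A minimal Python sketch; the class and field names are illustrative, not part of the skill:

```python
from dataclasses import dataclass, field

@dataclass
class ControlLoopMap:
    """One record per reviewed system; each field answers one Core Lens question."""
    reference_input: str                                            # 1. target state / goal
    controller: str                                                 # 2. what decides
    controlled_object: str                                          # 3. what is acted on
    sensors: list[str] = field(default_factory=list)                # 4. what observes system state
    feedback_paths: list[str] = field(default_factory=list)         # 5. what changes future behavior
    disturbance_sources: list[str] = field(default_factory=list)    # 6. delay, noise, saturation, error accumulation
    anti_runaway_guards: list[str] = field(default_factory=list)    # 7. what prevents positive feedback
    converges_after_disturbance: bool | None = None                 # 8. leave None until actually tested
```

An empty or guessed field is itself a finding: it marks where the review is still superficial.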


Engineering Cybernetics Mapping

Map the user's architecture into this structure:

| Cybernetics concept | Architecture equivalent |
| --- | --- |
| Reference input | User goal, SLA, product target, policy, task objective |
| Controller | Scheduler, orchestrator, agent, workflow engine, governance layer |
| Controlled object | Codebase, service, team process, model provider, external environment |
| Actuator | Tool calls, deployments, writes, API calls, workflow transitions |
| Sensor | Tests, logs, metrics, review, cost reports, human feedback, evals |
| Feedback | Signals that change later routing, budget, memory, workflow, or policy |
| Noise | Irrelevant context, stale memory, bad retrieval, flaky logs, misleading metrics |
| Delay | Async queues, slow tools, stale state sync, delayed human review |
| Error | Difference between target and actual state |
| Damping | Rate limits, compression, budget caps, confirmations, backoff, gates |
| Saturation | Token limits, API limits, budget limits, human attention limits |
| Stability boundary | Conditions where automation must pause, degrade, or ask for confirmation |

Review Workflow

Phase 1: Define the Control Loop

Produce a concise control-loop map:

Target → Controller → Actuator → Controlled object → Sensor → Feedback → Controller

Then identify:

  • feedback frequency
  • feedback quality
  • delay points
  • noise sources
  • constraints / saturation points
  • human checkpoints
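
As a worked illustration, here is such a map filled in for a hypothetical deployment agent, sketched as plain data. The system, names, and numbers are invented for illustration only:

```python
# Hypothetical system: an agent that merges approved changes and rolls them out.
control_loop = {
    "target": "every approved change reaches production within 1 hour with <1% error rate",
    "controller": "release orchestrator",
    "actuator": "deploy API calls, rollback commands",
    "controlled_object": "production services",
    "sensors": ["CI results", "error-rate metrics", "on-call reports"],
    "feedback": "error-rate breach triggers rollback and pauses the deploy queue",
    # Phase 1 annotations
    "feedback_frequency": "per deploy; metrics lag roughly 5 minutes",
    "delay_points": ["metrics lag", "human approval for schema changes"],
    "noise_sources": ["flaky integration tests"],
    "saturation_points": ["single deploy pipeline shared by all teams"],
    "human_checkpoints": ["schema migrations", "first deploy of a new service"],
}
```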

Phase 2: Detect Instability Patterns

Look for only the 3-5 highest-risk patterns. Do not dump a huge checklist.

1. Open-loop execution

Symptoms:

  • system executes but does not verify
  • failures do not change future strategy
  • humans discover errors only after damage

Control diagnosis: the system lacks effective feedback.
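
Closing such a loop means making verification a mandatory step of execution whose result changes what happens next. A minimal sketch; `run_action`, `verify`, and `on_failure` are placeholders for whatever the reviewed system actually does:

```python
def execute_closed_loop(action, run_action, verify, on_failure):
    """Execute, then verify; a failed check must change the next decision, not just get logged."""
    result = run_action(action)
    ok = verify(action, result)      # sensor step: never assume the action worked
    if not ok:
        on_failure(action, result)   # feedback step: retry, escalate, or change strategy
    return ok, result
```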

2. Positive feedback amplification

Symptoms:

  • bad memory causes worse retrieval, which writes more bad memory
  • failing retries increase cost and context noise
  • low-quality outputs feed later decisions without validation

Control diagnosis: error is amplified instead of damped.
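
Damping this loop usually means content is validated or confidence-scored before it is allowed to influence future retrieval. A sketch under the assumption that all memory writes pass through one function; the fields and threshold are invented:

```python
def write_memory(store, item, confidence, min_confidence=0.7):
    """Gate memory writes so retrieval output cannot re-enter memory unchecked."""
    if item.get("derived_from_retrieval") and not item.get("independently_verified"):
        return False   # break the amplification loop at the write boundary
    if confidence < min_confidence:
        return False   # low-confidence observations stay out of long-term memory
    store.append(item)
    return True
```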

3. Oscillation

Symptoms:

  • repeated searching, repeated rewriting, repeated re-planning
  • agents hand tasks back and forth
  • workflow loops without new evidence

Control diagnosis: feedback exists, but damping and convergence criteria are weak.
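
A convergence criterion can be as blunt as refusing to repeat a plan unless new evidence has appeared since the last attempt. A sketch; state fingerprinting is reduced to a hash for illustration:

```python
import hashlib

class OscillationGuard:
    """Detect a loop that keeps revisiting the same plan without new evidence."""

    def __init__(self, max_repeats: int = 2):
        self.seen: dict[str, int] = {}
        self.max_repeats = max_repeats

    def allow(self, plan: str, evidence: str) -> bool:
        key = hashlib.sha256(f"{plan}|{evidence}".encode()).hexdigest()
        self.seen[key] = self.seen.get(key, 0) + 1
        return self.seen[key] <= self.max_repeats
```

When allow() returns False, the controller should escalate or change strategy rather than re-plan the same step again.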

4. Delay-induced correction failure

Symptoms:

  • async feedback arrives after the system has already moved on
  • state is stale when decisions are made
  • human approval comes too late to prevent error propagation

Control diagnosis: the feedback loop has harmful latency.
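
A cheap mitigation is to stamp every feedback signal and refuse to act on signals that are older than the state they would steer. A sketch; the five-minute threshold and the version fields are illustrative:

```python
import time

def is_actionable(signal_timestamp: float, signal_state_version: int,
                  current_state_version: int, max_age_s: float = 300.0) -> bool:
    """A signal is usable only if it is fresh and refers to the state being decided on."""
    fresh = (time.time() - signal_timestamp) <= max_age_s
    same_state = signal_state_version == current_state_version
    return fresh and same_state
```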

5. Noise pollution

Symptoms:

  • irrelevant context dominates current task signal
  • stale memory is treated as truth
  • logs are too verbose to reveal failure causes

Control diagnosis: the sensor layer lacks filtering.
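
Filtering at the sensor layer means ranking and dropping candidate signals before they reach the decision step, rather than hoping the controller ignores them. A sketch; the relevance scorer and thresholds are assumptions:

```python
def filter_context(candidates, score_relevance, max_items=10, min_score=0.5):
    """Keep only fresh, relevant signals; everything else is treated as noise."""
    scored = [(score_relevance(c), c) for c in candidates if not c.get("stale", False)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:max_items] if score >= min_score]
```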

6. Error accumulation

Symptoms:

  • small deviations silently compound
  • no checkpoint validates intermediate state
  • rollback is missing or expensive

Control diagnosis: there is no bounded-error mechanism.
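
Bounding error means validating intermediate state at checkpoints and keeping a cheap path back to the last good state. A sketch; `validate` and the step functions stand in for real checks:

```python
import copy

def run_with_checkpoints(state, steps, validate):
    """Validate after every step; revert to the last good checkpoint instead of compounding errors."""
    last_good = copy.deepcopy(state)
    for step in steps:
        state = step(state)
        if validate(state):
            last_good = copy.deepcopy(state)   # advance the checkpoint
        else:
            return last_good, False            # bounded error: stop with a known-good state
    return state, True
```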

7. Fake adaptation

Symptoms:

  • system looks flexible, but every adjustment is manual
  • policies do not update from observed outcomes
  • “AI decides” but cannot explain or revise its decision rule

Control diagnosis: adaptation is not closed-loop.
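
Closed-loop adaptation means the decision rule itself is updated from observed outcomes and stays inspectable. A sketch using a per-route success ratio; the weak prior and the names are invented:

```python
class AdaptiveRouter:
    """Route choice shifts as outcomes accumulate, and the current rule is explainable."""

    def __init__(self, routes):
        self.stats = {r: {"wins": 1, "tries": 2} for r in routes}  # weak prior

    def record(self, route, success: bool):
        self.stats[route]["tries"] += 1
        self.stats[route]["wins"] += int(success)

    def choose(self):
        def rate(r):
            s = self.stats[r]
            return s["wins"] / s["tries"]
        best = max(self.stats, key=rate)
        reasons = {r: round(rate(r), 2) for r in self.stats}
        return best, reasons   # exposing the rates makes the rule revisable, not a black box
```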

8. Saturation blindness

Symptoms:

  • token/cost/API/human attention limits are treated as infinite
  • system fails only after hitting hard limits
  • no graceful degradation exists

Control diagnosis: constraints are not part of the controller.
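
Making constraints part of the controller means the budget check happens before the action and offers a degradation path instead of a hard stop at the limit. A sketch; the 80% threshold and the fallback are illustrative:

```python
def plan_under_budget(estimated_cost, remaining_budget, cheap_fallback):
    """Degrade gracefully as a limit approaches instead of failing only when it is hit."""
    if estimated_cost > remaining_budget:
        return "refuse", "budget exhausted; ask the user before continuing"
    if estimated_cost > 0.8 * remaining_budget:
        return "degrade", cheap_fallback   # e.g. smaller model, summary-only output
    return "proceed", None
```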

Phase 3: Measure Real Strengths

Only count an architectural strength if it improves controllability.

Good strengths include:

  • feedback that actually changes future behavior
  • tests/reviews that gate irreversible actions
  • cost/token/resource constraints that affect routing
  • memory that is filtered, layered, and reversible
  • observability that exposes why decisions were made
  • fallback and degradation paths
  • explicit pause/confirm conditions for high-risk actions

Do not praise “many modules”, “many agents”, or “complex workflows” unless they improve stability.

Phase 4: Score Control Maturity

Use this 0-5 scale:

| Level | Name | Meaning |
| --- | --- | --- |
| 0 | Open loop | Executes without reliable feedback |
| 1 | Human-corrected loop | Humans catch and correct most errors |
| 2 | Verified loop | Tests/reviews/metrics gate some actions |
| 3 | Damped closed loop | Budgets, gates, retries, and fallbacks reduce instability |
| 4 | Adaptive closed loop | Historical outcomes tune future routing/policy/budget |
| 5 | Self-evolving controlled system | The system improves its own procedures while preventing drift and pollution |

Give a level with one sentence of evidence.

Phase 5: Recommend Evolution

Prioritize only 1-3 actions.

Use this priority logic:

  • P0: Prevent loss of control — add gates, rollback, confirmation, error bounds, loop detection.
  • P1: Improve self-stabilization — add damping, filtering, degradation, better observability.
  • P2: Improve adaptation — use historical outcomes to tune policies, promote SOPs, evolve workflows.

Control Scorecard

When the user wants a rigorous review, include this table:

| Dimension | Score / 5 | What to check |
| --- | --- | --- |
| Feedback completeness | | Does output affect future decisions? |
| Stability / convergence | | Does the system stop oscillating and converge? |
| Noise filtering | | Are irrelevant/stale signals filtered before decisions? |
| Delay handling | | Are stale/late signals detected and handled? |
| Error control | | Are small errors bounded before they cascade? |
| Damping / resource control | | Are cost/token/API/human limits part of control? |
| Observability | | Can humans see why decisions were made? |
| Adaptation | | Do historical outcomes tune future behavior? |
| Safety boundary | | Does automation pause on irreversible/high-risk actions? |

Then summarize:

Control maturity level: L0-L5
Strongest control loop: ...
Weakest control loop: ...
Highest-risk instability: ...
Next P0 action: ...

Output Format

Use this default structure:

1. Control-system mapping

One concise paragraph or diagram mapping the system into controller / controlled object / feedback / noise / delay / constraints.

2. Architecture problems — Mirror

List the most important 3-5 problems.

For each:

  • Problem: what is unstable or uncontrolled
  • Control diagnosis: why this is a feedback/noise/delay/error issue
  • Consequence: what happens if it remains unfixed

3. Architecture strengths — Ruler

Only list strengths that improve controllability, stability, observability, or adaptation.

4. Evolution direction — Compass

Give 1-3 prioritized actions:

  • priority
  • action
  • why it comes first
  • how to verify it worked

5. Control scorecard

Include when the review is non-trivial or when the user asks for scoring.


Domain-Specific Review Prompts

Software platform / microservices

Check:

  • service health signals → routing / scaling / rollback
  • alert delay and alert fatigue
  • retry storms and cascading failures
  • circuit breakers and backpressure
  • deployment rollback and blast-radius control

Data pipeline / analytics system

Check:

  • data quality sensors
  • late-arriving data handling
  • schema drift detection
  • bad data quarantine
  • downstream contamination recovery

Workflow / operations system

Check:

  • whether each step has a gate
  • whether failed steps loop back to the right point
  • whether work-in-progress creates hidden queues
  • whether handoffs introduce delay or noise
  • whether “done” has measurable criteria

Agent / LLM system

Check:

  • tool calls are verified, not assumed
  • memory is layered and filtered
  • retrieval errors do not poison future context
  • model routing considers complexity, cost, latency, health, and outcome history
  • long outputs are compressed before entering context
  • high-risk writes/actions require confirmation

Organization / socio-technical system

Check:

  • decision feedback is timely enough to matter
  • local team incentives do not amplify global instability
  • metrics do not become noisy proxies
  • governance prevents irreversible mistakes without freezing execution

AgentOS / Multi-Agent Extension

Use this section only when reviewing AgentOS, multi-agent platforms, OpenClaw-like systems, LLM orchestration, or autonomous workflow engines.

AgentOS control-loop map

User goal / task queue
  → execution kernel / orchestrator
  → complexity scoring + routing policy
  → model / agent / tool execution
  → tests / review / cost / logs / user feedback
  → memory / metrics / SOP candidates
  → next routing, workflow, budget, or policy decision

AgentOS-specific questions

  • Is there a single default execution entrypoint, or can components bypass the control loop?
  • Does model routing use complexity, budget, health, latency, historical success/failure, and user preference?
  • Are high-token outputs such as test logs, build logs, git diffs, search/list/tree outputs compressed and measured?
  • Do workflow templates have gates and failure loops, or are they only diagrams?
  • Does memory separate runtime observations, task summaries, failure cases, SOP candidates, and long-term wiki notes?
  • Do failures and costs tune future behavior, or are they only recorded?
  • Are irreversible writes, public actions, deletions, deployments, and over-budget actions bounded by confirmation?

AgentOS scorecard add-on

| Dimension | Score / 5 | Deduct when... |
| --- | --- | --- |
| Default entrypoint unity | | APIs/tools bypass the kernel/orchestrator |
| Model routing adaptiveness | | Routing is keyword-only or ignores history/cost/health |
| Token damping | | Noisy outputs enter context uncompressed |
| Workflow gate strength | | Steps can advance without required evidence |
| Memory pollution resistance | | Temporary noise is written into long-term memory |
| Feedback-to-policy loop | | Failures/costs are logged but do not affect future decisions |
| Explainability | | Decisions lack request id, factors, or human-readable reason |
| Safety boundary | | High-risk automation has no pause/confirm path |

Common AgentOS instability patterns:

  • Kernel bypass: multiple execution paths ignore the main controller.
  • Feedback as logs only: sensors exist, but do not affect control.
  • Memory writeback overheating: every observation becomes long-term truth.
  • Formal-only workflow gates: steps exist on paper, but missing evidence does not block progress.
  • Unobservable damping: compression/budgeting exists, but savings and loss are not measurable.
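
Measurable token damping, which addresses both the "Token damping" deduction above and the "Unobservable damping" pattern, can be as simple as recording what was dropped whenever noisy tool output is compressed. A sketch; the summarizer and the rough 4-characters-per-token estimate are assumptions:

```python
def damp_tool_output(raw: str, summarize, max_chars: int = 4000):
    """Compress noisy tool output before it enters context, and record the saving so damping is observable."""
    if len(raw) <= max_chars:
        return raw, {"compressed": False, "tokens_saved_est": 0}
    summary = summarize(raw, max_chars=max_chars)
    saved = max(0, (len(raw) - len(summary)) // 4)   # rough estimate: ~4 characters per token
    return summary, {"compressed": True, "tokens_saved_est": saved}
```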

Anti-Patterns

Avoid these mistakes:

  • Turning the review into a theoretical control theory lecture.
  • Praising complexity as sophistication.
  • Listing every possible issue instead of the few that most affect control.
  • Suggesting adaptation before the system has basic stability.
  • Ignoring human attention and budget as hard constraints.
  • Treating logs as feedback when they do not change future decisions.
  • Treating memory as knowledge without filtering, promotion, or rollback.

Quality Bar

A good Control Mirror review must be:

  • Mapped: clear controller / object / sensor / feedback structure.
  • Diagnostic: names the control failure, not just the symptom.
  • Prioritized: gives P0/P1/P2, not a flat wishlist.
  • Verifiable: each recommendation has a way to check completion.
  • Grounded: tied to the actual architecture, code, workflow, or process.

If the output does not help the user make the system more controllable, stable, observable, or adaptive, it failed.
