design-panel

Multi-persona design review. Dispatches 4 UX/UI designer personas in parallel against a live web app, then converges on the highest-impact changes via cross-persona voting. Outputs a ranked report and a machine-readable fix plan. Pairs with /design-review for shipping the top findings. Use when asked to "design panel", "multi-persona design review", "designer panel", "panel review", or after major design milestones.

Install

openclaw skills install design-panel

/design-panel — Multi-Persona Design Review

Runs 4 UX/UI designer personas in parallel against a live app, then ranks findings via cross-persona voting. Outputs a report and a machine-readable fix plan. Pairs with /design-review for shipping the ranked fixes.

ROLE

You are the Panel Orchestrator. You do not author findings, you do not score them, and you do not "improve" persona prompts on the fly. You:

  1. Run pre-flight + arg parsing (Phase 0)
  2. Detect app-type with a visible DETECTED: line (Phase 1)
  3. Capture an evidence pack (Phase 2)
  4. Select personas (Phase 3)
  5. Dispatch all persona reviews in a single parallel Agent call (Phase 4)
  6. Dispatch all voting subagents in a single parallel Agent call (Phase 5)
  7. Compute ranking, write report.md + fix-plan.md (Phase 6)
  8. Print artifact paths + an optional gstack hand-off tip (Phase 7)

The fix-plan is a data artifact. Anyone (including the user) can feed it to /design-review manually if they want to ship the fixes — this skill never invokes other skills automatically.

Base directory for this skill

The harness exposes the skill's install directory via the "Base directory" line at the top of the loaded skill content. Persona files live at <base-dir>/personas/<id>.md. Reference that path explicitly in Phase 4/5 prompts — do not hardcode an absolute path.


TELEMETRY PREAMBLE (run first)

# gstack-style telemetry preamble — inlined, not inherited.
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo "off")
_TEL_START=$(date +%s)
_SESSION_ID="$$-$(date +%s)"
_OUTCOME="success"  # default; abort/error gates override before epilogue
_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown")
echo "TELEMETRY: ${_TEL:-off}  SESSION: $_SESSION_ID  BRANCH: $_BRANCH"

# Pending marker — epilogue clears it; if the skill crashes, the next gstack
# skill to start finalizes it as outcome=unknown.
mkdir -p ~/.gstack/analytics
echo '{"skill":"design-panel","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","session_id":"'"$_SESSION_ID"'"}' \
  > ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true

# Local analytics start row (gated on gstack telemetry tier)
if [ "$_TEL" != "off" ]; then
  echo '{"skill":"design-panel","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo unknown)'"}' \
    >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
fi

# Timeline event for skill start. Best-effort; failures are silenced.
if [ -x ~/.claude/skills/gstack/bin/gstack-timeline-log ]; then
  _TL_PAYLOAD=$(jq -nc --arg branch "$_BRANCH" --arg sid "$_SESSION_ID" \
    '{skill:"design-panel",event:"started",branch:$branch,session:$sid}' 2>/dev/null || echo '{}')
  ~/.claude/skills/gstack/bin/gstack-timeline-log "$_TL_PAYLOAD" 2>/dev/null &
fi

# Persist telemetry state to disk so the epilogue can recover it even after
# the shell context is lost (each Bash tool call is a fresh shell).
mkdir -p ~/.gstack/analytics
cat > ~/.gstack/analytics/.tel-design-panel-"$_SESSION_ID".sh <<EOF
export _TEL="$_TEL"
export _TEL_START="$_TEL_START"
export _SESSION_ID="$_SESSION_ID"
export _OUTCOME="$_OUTCOME"
EOF
echo "TEL_STATE: ~/.gstack/analytics/.tel-design-panel-$_SESSION_ID.sh"

Note on gstack dependencies: If gstack-config or gstack-timeline-log is missing, the bash blocks above silently fall through to || true paths. The skill still runs, just without telemetry. That's intentional — gstack is recommended but not required.


PHASE 0 — Pre-flight + arg parsing

0.1 Parse arguments

The user invocation may include:

  • <url> — optional. If absent, attempt local dev server detection (see 0.2).
  • --personas <list> — explicit roster (e.g. a11y,brand,mobile,conversion), +id to add to defaults, -id to remove.
  • --deep — runs all 8 personas instead of the default 4.
  • --report-only — skip the Phase 7 "next steps" suggestion.
  • --yes — non-interactive. Auto-confirms the --deep cost prompt.

Capture into shell variables: URL, PERSONAS_OVERRIDE, DEEP, REPORT_ONLY, YES.

0.2 Detect project + dev server (only if no <url> given)

Read the current working directory for stack indicators. Infer:

  1. The project framework (Next.js, Vite, Rails, Django, etc.)
  2. The expected dev URL for that framework's defaults
  3. The right command to start the dev server

Then probe the inferred URL. If reachable → use it. If not → tell the user the exact command to start it. Do not port-scan a generic list; do not interrogate the user when the project file already tells us what to do.

_PROJECT_TYPE="unknown"
_EXPECTED_URL=""
_DEV_CMD=""

if [ -f package.json ]; then
  # Package manager from lockfile
  if   [ -f bun.lockb ] || [ -f bun.lock ]; then _PM="bun"
  elif [ -f pnpm-lock.yaml ];                  then _PM="pnpm"
  elif [ -f yarn.lock ];                       then _PM="yarn"
  else                                              _PM="npm"; fi

  # Framework from deps
  if jq -e '(.dependencies // {}).next // (.devDependencies // {}).next' package.json >/dev/null 2>&1; then
    _PROJECT_TYPE="next";  _EXPECTED_URL="http://localhost:3000"
  elif jq -e '(.dependencies // {}).nuxt // (.devDependencies // {}).nuxt' package.json >/dev/null 2>&1; then
    _PROJECT_TYPE="nuxt";  _EXPECTED_URL="http://localhost:3000"
  elif jq -e '(.dependencies // {}).astro // (.devDependencies // {}).astro' package.json >/dev/null 2>&1; then
    _PROJECT_TYPE="astro"; _EXPECTED_URL="http://localhost:4321"
  elif jq -e '(.dependencies // {}).vite // (.devDependencies // {}).vite' package.json >/dev/null 2>&1; then
    _PROJECT_TYPE="vite";  _EXPECTED_URL="http://localhost:5173"
  elif jq -e '(.dependencies // {})["@remix-run/dev"] // (.devDependencies // {})["@remix-run/dev"]' package.json >/dev/null 2>&1; then
    _PROJECT_TYPE="remix"; _EXPECTED_URL="http://localhost:3000"
  elif jq -e '(.dependencies // {})["@sveltejs/kit"] // (.devDependencies // {})["@sveltejs/kit"]' package.json >/dev/null 2>&1; then
    _PROJECT_TYPE="sveltekit"; _EXPECTED_URL="http://localhost:5173"
  else
    _PROJECT_TYPE="node"; _EXPECTED_URL="http://localhost:3000"
  fi

  # Dev command — prefer scripts.dev, fall back to scripts.start
  _SCRIPT_KEY=$(jq -r 'if .scripts.dev then "dev" elif .scripts.start then "start" else empty end' package.json 2>/dev/null)
  if [ -n "$_SCRIPT_KEY" ]; then
    _DEV_CMD="$_PM run $_SCRIPT_KEY"
  else
    _DEV_CMD="$_PM run dev   # (no scripts.dev defined — add one to package.json)"
  fi

elif [ -f Gemfile ]; then
  _PROJECT_TYPE="rails"
  _EXPECTED_URL="http://localhost:3000"
  _DEV_CMD="bundle exec rails server"

elif [ -f manage.py ]; then
  _PROJECT_TYPE="django"
  _EXPECTED_URL="http://localhost:8000"
  _DEV_CMD="python manage.py runserver"

elif [ -f pyproject.toml ] || [ -f requirements.txt ]; then
  _PROJECT_TYPE="python"
  _EXPECTED_URL="http://localhost:8000"
  if grep -qiE '(^|")fastapi' pyproject.toml requirements.txt 2>/dev/null; then
    _DEV_CMD="uvicorn main:app --reload"
  elif grep -qiE '(^|")flask' pyproject.toml requirements.txt 2>/dev/null; then
    _DEV_CMD="flask run"
  else
    _DEV_CMD="python -m http.server 8000   # (adapt to your app's entry point)"
  fi

elif [ -f Cargo.toml ]; then
  _PROJECT_TYPE="rust"
  _EXPECTED_URL="http://localhost:8080"
  _DEV_CMD="cargo run"

elif [ -f go.mod ]; then
  _PROJECT_TYPE="go"
  _EXPECTED_URL="http://localhost:8080"
  _DEV_CMD="go run ."

elif [ -f mix.exs ]; then
  _PROJECT_TYPE="phoenix"
  _EXPECTED_URL="http://localhost:4000"
  _DEV_CMD="mix phx.server"
fi

echo "PROJECT: $_PROJECT_TYPE  EXPECTED: ${_EXPECTED_URL:-none}  DEV_CMD: ${_DEV_CMD:-none}"

# Probe the expected URL (2s timeout — local servers respond fast)
URL_FOUND=""
if [ -n "$_EXPECTED_URL" ]; then
  if curl -sf -m 2 -o /dev/null "$_EXPECTED_URL" 2>/dev/null; then
    URL_FOUND="$_EXPECTED_URL"
  fi
fi
echo "URL_REACHABLE: ${URL_FOUND:-none}"

Decision tree

  1. URL was passed explicitly → use it. Skip the detection prints (already shown above; that's fine — they're informational, not an interaction).
  2. URL_FOUND non-empty → use it. Project detection confirmed and reachable.
  3. _EXPECTED_URL known but unreachable:
    • Print:
      DETECTED: <project_type> project. Expected URL: <_EXPECTED_URL>
      NOT_RUNNING: <_EXPECTED_URL> is not reachable.
      Start your dev server first:
        <_DEV_CMD>
      Then re-run /design-panel.
      
    • With --yes: exit 2.
    • Without --yes: AskUserQuestion with three options:
      • A) I started it — re-probe and continue
      • B) Use a different URL (then ask for URL)
      • C) Cancel (exit 2)
  4. _PROJECT_TYPE is unknown (no recognized stack indicator in cwd):
    • With --yes: print ERROR: no <url> arg, no recognized project in cwd (looked for package.json, Gemfile, manage.py, pyproject.toml, Cargo.toml, go.mod, mix.exs). Re-run with an explicit URL. exit 2.
    • Without --yes: AskUserQuestion: "No recognized project in cwd. Paste the URL you want me to review, or start your app and re-run."

Notes

  • Detection is heuristic. The _DEV_CMD printed in not-running messages is a best-guess for the framework's defaults. If the user has a custom port or non-standard launcher, they should pass <url> explicitly.
  • Monorepos: the skill reads the cwd's package.json only. If you're in a monorepo with workspace-relative apps, cd into the app's directory before invoking.
  • Remote URLs (deployed staging/prod): always pass <url> explicitly. Detection only helps with local dev.

0.3 Print duration estimate

Before starting any expensive work, print one line so the user knows the time budget:

EXPECTED DURATION: ~90s (standard, 4 personas). Use --yes to skip --deep cost prompts.

If --deep was passed without --yes, AskUserQuestion:

"Run all 8 personas? Cost: ~16 subagent dispatches, expected 180–300s wall-clock. A) Yes B) Standard run (4 personas) instead C) Cancel"

If --yes and --deep, skip the confirmation; log "--deep accepted via --yes" to telemetry.

0.4 Hard duration caps

  • Standard run: 300s ceiling. If exceeded, abort and write partial findings with [aborted-at-cap] tag.
  • --deep run: 600s ceiling. Same partial-write behavior.

Implement via wall-clock checks at each phase boundary, not signal handlers.
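
A minimal sketch of such a boundary check, assuming _TEL_START is recovered from the TEL_STATE file; _CAP is an illustrative name (set it to 300 or 600), not something the preamble defines:

# Phase-boundary duration check. Call between phases, never from a trap.
_NOW=$(date +%s)
_ELAPSED=$(( _NOW - ${_TEL_START:-$_NOW} ))
if [ "$_ELAPSED" -ge "${_CAP:-300}" ]; then
  _OUTCOME="aborted-at-cap"
  echo "ABORTED at duration cap (${_ELAPSED}s >= ${_CAP}s)."
  # Write whatever findings exist so far, tag the report [aborted-at-cap],
  # run the telemetry epilogue, then exit 1.
fi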


PHASE 1 — App-type detection (with visibility contract)

Fetch the URL via $B navigate (gstack browse binary) or mcp__browse__* if $B isn't on PATH. Pull the DOM via snapshot.

Classify as one of:

  • LANDING — no auth UI detected, has hero+CTA pattern, <main> content is marketing sections (testimonials, pricing, feature grids).
  • APP_UI — login/auth flow detected, OR sidebar/topbar app chrome present, OR routes match patterns like /dashboard, /settings, /projects/....
  • HYBRID — marketing homepage with authenticated product behind a CTA.

Ambiguous results default to HYBRID (most conservative roster). Log heuristic scores in the report header for transparency.
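
One rough way to produce those heuristic scores, assuming the Phase 1 snapshot was first saved to a local file; $_SNAPSHOT, the signal patterns, and the thresholds are all illustrative, not prescribed by this skill:

# Crude app-type signals over the saved DOM snapshot. Each count is
# matching lines per signal family; the classifier turns these into the
# confidence numbers logged in the report header.
_AUTH=$(grep -ciE 'type="password"|/(login|signin|sign-in)"' "$_SNAPSHOT" 2>/dev/null || true)
_CHROME=$(grep -ciE 'sidebar|topbar|/dashboard|/settings' "$_SNAPSHOT" 2>/dev/null || true)
_MARKETING=$(grep -ciE 'hero|pricing|testimonial' "$_SNAPSHOT" 2>/dev/null || true)
echo "HEURISTICS: auth=$_AUTH chrome=$_CHROME marketing=$_MARKETING"
# Marketing-heavy with no auth signals → LANDING. Auth or app chrome
# dominant → APP_UI. Mixed or inconclusive → HYBRID.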

Visibility contract — Phase 1 ALWAYS prints one of:

DETECTED: http://localhost:3000  (APP_UI, confidence 0.82)

…or, if URL is missing:

NO_URL: no <url> arg and no local dev server found. Asking…

This guarantees the user knows within a few seconds whether the skill is on the right target — no silent screenshot phase against the wrong URL.


PHASE 2 — Evidence pack capture

Location: ~/.gstack/sessions/$SESSION_ID/design-panel/evidence/

evidence/
├── manifest.json                  # routes, viewports, captures performed
├── screenshots/
│   ├── home_desktop.png           # 1440×900
│   ├── home_mobile.png            # 390×844
│   ├── home_tablet.png            # 768×1024  (only with --deep)
│   ├── <route>_desktop.png        # one set per significant route
│   └── <route>_mobile.png
├── interactions/
│   ├── hover_primary_cta.png      # key hover/focus states
│   ├── nav_open_mobile.png        # mobile nav opened
│   └── form_filled.png            # form mid-state if present
└── computed.json                  # CSS vars, font stacks, palette,
                                   # breakpoints, motion durations

Capture strategy:

  • Routes: homepage always. Up to 4 additional routes auto-detected, in this priority order (see the sketch after this list):
    1. Routes listed in /sitemap.xml if present, capped at 4.
    2. Otherwise, the first 4 unique same-origin <a href> targets in the primary <nav> element.
    3. Otherwise, homepage only.
  • Viewports: desktop (1440) + mobile (390) always. Tablet (768) only with --deep.
  • Interactions: primary CTA hover, mobile nav open, first form's filled state — captured if present, silently skipped otherwise.
  • computed.json: small data file pulled from the live page (CSS custom properties, font-family stacks, used colors, defined breakpoints, transition durations).
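
A sketch of that route priority, assuming curl/grep/sed and the saved homepage snapshot from Phase 1; note this crudely scans the whole page for hrefs rather than isolating the primary <nav>:

# Route auto-detection: sitemap first, then same-origin hrefs, else
# homepage only.
ROUTES="/"
if curl -sf -m 2 "$URL/sitemap.xml" -o /tmp/dp-sitemap.xml 2>/dev/null; then
  _EXTRA=$(grep -oE '<loc>[^<]+</loc>' /tmp/dp-sitemap.xml \
    | sed -E 's|</?loc>||g' | head -4)
else
  _EXTRA=$(grep -oE 'href="/[^"#]+"' "$_SNAPSHOT" 2>/dev/null \
    | sed -E 's/^href="//; s/"$//' | sort -u | head -4)
fi
echo "ROUTES: $ROUTES $_EXTRA"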

Authenticated apps: If the target redirects to /login, halt with the auth failure path (see Failure modes).

Why disk, not inline: subagent prompts pass file paths; same artifacts are read by all review and voting subagents; user can sanity-check what the panel saw.

Cleanup: session dir is reaped by gstack's existing 120-minute mtime cleanup (or accumulates harmlessly if gstack isn't installed).


PHASE 3 — Persona selection

Roster (8 personas, files at <base-dir>/personas/<id>.md)

| ID | Name | Lens |
|---|---|---|
| a11y | Accessibility Auditor | Contrast, focus order, ARIA, keyboard, touch targets, landmarks |
| conversion | Conversion Optimizer | CTA hierarchy, funnel friction, trust signals, form UX |
| brand | Brand & Visual Director | Typography, color system, premium feel, AI-slop patterns |
| motion | Motion & Interaction Designer | Microinteractions, perceived speed, feedback on action |
| mobile | Mobile-First Designer | Thumb zones, viewport, tap targets, horizontal scroll |
| ia | Information Architect | Wayfinding, breadcrumbs, page titles, content hierarchy |
| trust | Trust & Credibility Reviewer | Social proof, copy honesty, dark patterns, error empathy |
| power | Power-User Advocate | Keyboard shortcuts, density, bulk actions, efficiency |

Default selection (4 of 8, picked by app type)

| App type | Default 4 |
|---|---|
| LANDING | brand, conversion, trust, mobile |
| APP_UI | ia, a11y, power, mobile |
| HYBRID | brand, conversion, a11y, ia |

The default roster is 4 personas rather than 3 for statistical reasons: at N=3 each finding has only 2 cross-voters; at N=4 it has 3, which gives a meaningful agreement signal instead of a coin-flip spread. Cost trade-off: ~33% more subagent dispatches per run.

Override syntax

  • --personas brand,a11y,motion,trust — explicit list (skips auto-selection).
  • --personas +motion — add to defaults.
  • --personas -trust — remove from defaults.
  • --deep — run all 8 (confirmed via AskUserQuestion unless --yes).
  • Unknown persona ids hard-fail with exit code 3 and the valid id list.
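
A parser sketch for this syntax, assuming the app-type defaults are already in _DEFAULT_PERSONAS (space-separated); the variable names are illustrative:

_VALID="a11y conversion brand motion mobile ia trust power"
_ROSTER="$_DEFAULT_PERSONAS"
case "$PERSONAS_OVERRIDE" in
  ""|+*|-*) ;;          # empty, +id, or -id forms keep the defaults as the base
  *) _ROSTER="" ;;      # plain explicit list replaces the defaults entirely
esac
IFS=',' read -ra _TOKENS <<< "$PERSONAS_OVERRIDE"
for _T in "${_TOKENS[@]}"; do
  _ID="${_T#[+-]}"
  echo "$_VALID" | grep -qw "$_ID" \
    || { echo "ERROR: unknown persona id '$_ID'. Valid: ${_VALID// /, }."; exit 3; }
  case "$_T" in
    -*) _ROSTER=$(echo "$_ROSTER" | tr ' ' '\n' | grep -vx "$_ID" | xargs) ;;
    *)  echo "$_ROSTER" | grep -qw "$_ID" || _ROSTER="$_ROSTER $_ID" ;;
  esac
done
echo "PERSONAS: $_ROSTER"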

Output

Log the final persona list to the report header. Example:

PERSONAS: ia, a11y, power, mobile  (APP_UI default)

Shared schemas (used by Phase 4 and Phase 5 prompts)

Finding schema (Phase 4 output — one entry per finding)

{
  "persona_id": "a11y",
  "id": "a11y-001",
  "title": "Primary CTA fails 4.5:1 contrast on hero",
  "severity": "high",
  "evidence": ["screenshots/home_desktop.png"],
  "where": "hero section, .btn-primary",
  "why_from_my_lens": "Users with low vision can't read the most important action on the page.",
  "suggested_fix": "Darken --color-primary from #6B8AFD to #4A6BE8 (passes 4.7:1).",
  "file_hint": "src/styles/tokens.css or wherever --color-primary is defined"
}
  • severity: enum critical | high | medium.
  • id: persona-prefixed for cross-merge uniqueness.
  • evidence: paths RELATIVE to the evidence pack root.

Score schema (Phase 5 output — one per voting persona)

{
  "voter_persona_id": "brand",
  "scores": [
    { "finding_id": "a11y-001", "score": 7, "reason": "Better contrast also reads more premium." },
    { "finding_id": "conversion-003", "score": 9, "reason": "Hero anchor is the brand's whole pitch." }
  ]
}
  • score: integer 0–10.
  • A persona does NOT score its own findings (filter on the voter side; the orchestrator also defends with a filter in Phase 6).

PHASE 4 — Parallel persona reviews

For each selected persona, build a dispatch prompt. The prompt has four parts:

  1. The persona's full markdown content — Read from <base-dir>/personas/<id>.md and include verbatim. This is the persona's role + rubric.
  2. The evidence pack path — ~/.gstack/sessions/$SESSION_ID/design-panel/evidence/. Tell the subagent to read manifest.json first to know what's available.
  3. The Finding schema (above) — Include the schema definition verbatim so the subagent emits the exact shape.
  4. The output path — Tell the subagent to write its findings JSON to ~/.gstack/sessions/$SESSION_ID/design-panel/findings/<persona_id>.json.

Dispatch rule

Dispatch ALL N Agent calls in a single orchestrator message. They run in parallel. Do NOT call them sequentially — that defeats the parallel design.

For each persona, the Agent call has:

  • subagent_type: "general-purpose"
  • model: "sonnet" (taste + judgment for review work; haiku is too thin for this)
  • description: "design-panel review: <persona_id>"
  • prompt: the four-part prompt above
  • No isolation flag — these are read-only reviews, no commits.

Subagent prompt template

You are the {persona.name} from /design-panel. Read your full role/rubric below, then review the live app via the evidence pack at the path provided.

## Your role and rubric (verbatim from persona file)

{full content of <base-dir>/personas/<persona_id>.md}

## Evidence pack

Path: ~/.gstack/sessions/{SESSION_ID}/design-panel/evidence/

Start by reading manifest.json to see what was captured. Then inspect screenshots and computed.json relevant to your lens. Do NOT navigate the live URL — you review only what's in the evidence pack.

## Required output

Return findings as a JSON object matching this exact schema:

{
  "persona_id": "{persona_id}",
  "findings": [
    {
      "id": "string (must start with '{persona_id}-' followed by a 3-digit number, e.g. '{persona_id}-001')",
      "title": "string (one-line label, <80 chars)",
      "severity": "critical | high | medium",
      "evidence": ["string (path relative to evidence/, e.g. 'screenshots/home_desktop.png')"],
      "where": "string (component, section, or selector)",
      "why_from_my_lens": "string (one sentence from YOUR lens, not generic)",
      "suggested_fix": "string (concrete proposed change)",
      "file_hint": "string (likely source path, optional — leave as empty string if unknown)"
    }
  ]
}

Write the result to: ~/.gstack/sessions/{SESSION_ID}/design-panel/findings/{persona_id}.json

## Rules

- 3–10 findings. Quality over quantity. Don't pad.
- Every finding's evidence array must reference at least one real file in the evidence pack.
- Severity rubric is in your role above. Use it.
- Stay in your lens. If you find something outside your lens (e.g. you're a11y and you notice a brand issue), DO NOT include it. Other personas cover those.
- Write the JSON file before returning.

After dispatch

Wait for all returns. Tag each persona's result: success / failed (timeout) / malformed-output (JSON parse failure). If <2 personas succeed, abort the run with the "all reviews failed" failure mode.
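
One way to apply that tagging, assuming jq, the findings/ layout above, and that $SESSION_ID and the Phase 3 roster (_ROSTER here) are in scope:

# Tag each persona result after wave 1. A file that exists and parses
# with a findings array counts as success.
_OK=0
for _P in $_ROSTER; do
  _F=~/.gstack/sessions/"$SESSION_ID"/design-panel/findings/"$_P".json
  if [ ! -f "$_F" ]; then
    echo "WARNING: persona '$_P' review failed (no output). Report tagged [partial]."
  elif ! jq -e '.findings | type == "array"' "$_F" >/dev/null 2>&1; then
    echo "WARNING: persona '$_P' review failed (malformed output). Report tagged [partial]."
  else
    _OK=$(( _OK + 1 ))
  fi
done
[ "$_OK" -ge 2 ] || { echo "ERROR: fewer than 2 review subagents succeeded. Aborting."; exit 2; }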


PHASE 5 — Cross-persona voting

Concatenate all findings into ~/.gstack/sessions/$SESSION_ID/design-panel/all_findings.json:

[
  { /* finding from persona A */ },
  { /* finding from persona A */ },
  { /* finding from persona B */ },
  ...
]

IDs are persona-prefixed so duplicates indicate a subagent bug — surface them in the report header, do not silently merge.
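
A jq sketch of the merge plus the duplicate-id check. It also re-attaches persona_id to each finding, since the Phase 4 output schema nests findings under a top-level persona_id and the Phase 5 self-vote rule needs it per finding:

_DP=~/.gstack/sessions/"$SESSION_ID"/design-panel
jq -s '[ .[] | .persona_id as $p | .findings[] | . + {persona_id: $p} ]' \
  "$_DP"/findings/*.json > "$_DP"/all_findings.json
_DUPES=$(jq -r 'group_by(.id) | map(select(length > 1) | .[0].id) | join(" ")' \
  "$_DP"/all_findings.json)
[ -n "$_DUPES" ] && echo "WARNING: duplicate finding ids (subagent bug): $_DUPES"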

Dispatch rule (same as Phase 4)

For each persona, build a voting prompt and dispatch all N Agent calls in a single message. Each voter gets:

  • Its own persona markdown file (for consistent voice)
  • The full all_findings.json content
  • The rule "score only findings from OTHER personas"
  • The Score schema

Each voting Agent call:

  • subagent_type: "general-purpose"
  • model: "sonnet" (judgment matters)
  • description: "design-panel voting: <persona_id>"
  • prompt: the voting prompt
  • No isolation flag.

Voting prompt template

You are the {persona.name} from /design-panel, now in voting mode.

## Your role and lens (verbatim from persona file)

{full content of <base-dir>/personas/<persona_id>.md}

## Findings to score

Below are findings from the OTHER personas on this panel. For each one, score 0–10 how impactful you think this change would be from YOUR lens (where 10 = ships a meaningfully better product, 0 = irrelevant or wrong from where you sit).

Include a one-sentence reason per score.

Do NOT re-review the app. Do NOT add new findings. Score only what's given.

Do NOT score findings where persona_id == "{persona_id}" (your own). Skip those entirely — the orchestrator filters them too, but you should also filter to keep your output clean.

## Findings

{contents of all_findings.json}

## Required output

Return JSON matching this schema:

{
  "voter_persona_id": "{persona_id}",
  "scores": [
    { "finding_id": "string (id from the findings above)", "score": <integer 0-10>, "reason": "string (one sentence)" }
  ]
}

Write the result to: ~/.gstack/sessions/{SESSION_ID}/design-panel/scores/{persona_id}.json

After dispatch

Wait for all returns. Drop any voter that returns malformed JSON. If <2 voters survive, fall back to severity-only ranking and tag the report [no cross-vote].


PHASE 6 — Ranking + write artifacts

Ranking math

For each finding f:

severity_weight  = { critical: 1.5, high: 1.0, medium: 0.6 }[f.severity]
cross_scores     = scores from all OTHER personas (self excluded)
mean_cross_score = mean(cross_scores)        # 0..10
agreement_spread = max(cross_scores) - min(cross_scores)  # 0..10
impact_score     = mean_cross_score × severity_weight

Sort by impact_score descending. Ties broken by lower spread (consensus wins over divisive picks).
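
A jq sketch of this computation over all_findings.json and scores/*.json, with _DP as in the merge sketch above; ranked.json is an illustrative output name, and the self-vote filter from Phase 5 is repeated here as the orchestrator-side defense:

jq -n '
  input as $findings                       # first input: all_findings.json
  | [ inputs | .voter_persona_id as $v
      | .scores[] | {voter: $v, finding_id, score} ] as $votes
  | $findings | map(
      . as $f
      | [ $votes[] | select(.finding_id == $f.id and .voter != $f.persona_id)
          | .score ] as $cs
      | $f + {
          mean_cross_score: (if ($cs|length) > 0 then ($cs|add) / ($cs|length) else 0 end),
          agreement_spread: (if ($cs|length) > 0 then ($cs|max) - ($cs|min) else 0 end)
        }
      | . + { impact_score: (.mean_cross_score
          * (if $f.severity == "critical" then 1.5
             elif $f.severity == "high" then 1.0 else 0.6 end)) })
  | sort_by([-.impact_score, .agreement_spread])
' "$_DP"/all_findings.json "$_DP"/scores/*.json > "$_DP"/ranked.json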

Statistical caveats by N

  • N=4 standard run — 3 cross-voters per finding. Mean is meaningful. Spread is directional. The Dissent watch flags findings where spread ≥ 5 points.
  • N=8 --deep run — 7 cross-voters per finding. Both mean and spread are meaningful. Dissent watch uses spread ≥ 4.
  • N=3 (only reachable via --personas a,b,c) — 2 cross-voters per finding. Stats are thin; Dissent watch is suppressed and the report header is tagged [N=3 — voting signal directional only].
  • N<3 (only reachable via --personas a,b after failures) — voting round is skipped entirely. Ranking falls back to severity-only and the report header is tagged [no cross-vote].

Three views in the report

  • Top 5 — highest impact_score. The ship list.
  • Dissent watch — top 3 findings by agreement_spread (threshold above). Divisive findings often reveal taste/strategy tradeoffs, not bugs.
  • Persona-only signal — for each persona, the highest-rated finding only they cared about (where cross-voters scored it ≤ 4 but the originator marked it ≥ high). Captures specialist insight that mean-ranking washes out.
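
A jq sketch of the Persona-only signal filter over ranked.json from the ranking sketch above, reading "cross-voters scored it ≤ 4" as mean cross-score ≤ 4:

# Highest-impact finding per persona that the rest of the panel rated
# low but the originator marked high or critical.
jq '[ .[] | select((.severity == "critical" or .severity == "high")
                   and .mean_cross_score <= 4) ]
    | group_by(.persona_id)
    | map(max_by(.impact_score))' "$_DP"/ranked.json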

Artifact 1: docs/design-panel/report-YYYY-MM-DD-HHMM.md

Human-readable. Skeleton:

# Design Panel Review — <date>

**App type:** APP_UI
**Personas:** Information Architect, Accessibility Auditor, Power-User Advocate, Mobile-First Designer
**Pages reviewed:** /, /dashboard, /settings
**Evidence pack:** ~/.gstack/sessions/<id>/design-panel/evidence/
**Schema version:** 1

## Top 5 (ship list)

### 1. Primary CTA fails 4.5:1 contrast on hero — impact 6.7
- **Flagged by:** Accessibility Auditor (severity: high)
- **Cross-scores:** IA 7, Power-User 6, Mobile 7 (mean 6.7, spread 1 — high agreement)
- **Representative cross-vote reason:** "Better contrast also reads more premium" — IA
- **Suggested fix:** Darken `--color-primary` from `#6B8AFD` to `#4A6BE8`.
- **Where to look:** `src/styles/tokens.css`
- **Evidence:** ![](../../<evidence-pack-path>/screenshots/home_desktop.png)

[... entries 2–5 ...]

## Dissent watch (panel disagreed)

[Top 3 findings by spread — useful for taste/strategy calls]

## Persona-only signal

- **Information Architect's solo pick:** Breadcrumbs missing on /settings/* — only IA flagged it; worth a look if site grows.
- **Accessibility Auditor's solo pick:** ...
- **Power-User Advocate's solo pick:** ...
- **Mobile-First Designer's solo pick:** ...

## All findings (collapsed)

<details>
<summary>26 findings total</summary>

[Full list, grouped by persona]

</details>

Artifact 2: docs/design-panel/fix-plan-YYYY-MM-DD-HHMM.md

Machine-readable. Header is YAML frontmatter; body is a checkbox list with structured sub-bullets.

---
schema_version: 1
source: design-panel
generated_at: 2026-05-13T14:42:00Z
source_report: docs/design-panel/report-2026-05-13-1442.md
n_personas: 4
n_findings: 5
---

# Fix Plan — <date>

## Fixes (top 5, impact-ordered)

- [ ] **a11y-001** Primary CTA contrast
  - severity: high
  - where: hero section, .btn-primary
  - change: `--color-primary: #6B8AFD` → `#4A6BE8`
  - verify: contrast ≥ 4.5:1 against `--color-background`
  - evidence_path: screenshots/home_desktop.png
  - file_hint: src/styles/tokens.css
  - impact_score: 6.7

- [ ] **conversion-003** Hero value-prop buried below fold on mobile
  - severity: high
  - where: hero, primary CTA section
  - change: move tagline above feature grid on viewports < 768px
  - verify: tagline visible in mobile screenshot above the fold
  - evidence_path: screenshots/home_mobile.png
  - file_hint: src/components/Hero.tsx
  - impact_score: 6.2

[... etc ...]

Required fields per entry: id, severity, title (implicit in the heading), where, change, verify.

Optional fields per entry: evidence_path, file_hint, impact_score.

Anyone consuming this file (a future tool, a human, or hand-fed into another skill) reads the YAML frontmatter for the version, then iterates entries. Unknown fields are ignored. Schema is intentionally narrow.
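
A consumer sketch, assuming the exact layout above; the plan path is the example used throughout this doc:

_PLAN="docs/design-panel/fix-plan-2026-05-13-1442.md"
# Frontmatter version check, then entry ids (the bold token after each
# unchecked checkbox).
_VER=$(sed -n '2,/^---$/p' "$_PLAN" | grep '^schema_version:' | awk '{print $2}')
[ "$_VER" = "1" ] || echo "WARNING: unexpected fix-plan schema_version '$_VER'"
grep -oE '^- \[ \] \*\*[a-z0-9]+-[0-9]+\*\*' "$_PLAN" | grep -oE '[a-z0-9]+-[0-9]+'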


PHASE 7 — Handoff (or stop, depending on flags)

Print the artifact paths first, always:

Design Panel complete.

  Report:    docs/design-panel/report-2026-05-13-1442.md
  Fix plan:  docs/design-panel/fix-plan-2026-05-13-1442.md

  Top finding: Primary CTA fails 4.5:1 contrast (impact 6.7)
  Panel disagreed on:  2 findings (see Dissent watch)

If --report-only

Stop here. Exit 0.

Otherwise — gstack hand-off tip

Print one suggestion (one-line):

If you use gstack, you can hand-feed the fix-plan to /design-review:
  "Read the fix plan at docs/design-panel/fix-plan-2026-05-13-1442.md and run your
   Phase 8 fix loop against the entries listed there."

The skill does NOT invoke /design-review automatically. No AskUserQuestion. The fix-plan is data; what the user does with it is their call.

Exit 0.


FAILURE MODES

Every failure mode follows the problem + cause + fix template — the user sees what broke, why, and the next command to run.

| Scenario | Exit | User sees |
|---|---|---|
| No URL given, no local dev server detected |  | `NO_URL: no <url> arg and no local dev server found.` → AskUserQuestion for URL |
| Browse can't reach URL (network, 5xx, JS error) | 2 | `ERROR: could not reach <url> (<reason>). Is your dev server running? Re-run with: /design-panel <url>` |
| App requires auth, no cookies imported | 2 | `BLOCKED: <url> redirected to /login. /design-panel needs an authenticated session for APP_UI. Fix: import cookies for the domain (gstack: /setup-browser-cookies), then re-run.` |
| App-type detection ambiguous |  | Defaults to HYBRID with `WARNING: app-type detection inconclusive (LANDING 0.41, APP_UI 0.39, HYBRID 0.45). Using HYBRID. Override with --personas if wrong.` |
| Unknown persona id in --personas | 3 | `ERROR: unknown persona id '<id>'. Valid: a11y, conversion, brand, motion, mobile, ia, trust, power. Fix: check spelling.` |
| One review subagent fails | 1 | `WARNING: persona '<id>' review failed (<reason>). Continuing with N-1/N personas. Report tagged [partial].` |
| Voting subagent returns malformed JSON | 1 | `WARNING: persona '<id>' voting output unparseable. Dropping voter. Remaining: <N> voters.` |
| <2 voters survive | 1 | Falls back to severity-only ranking. Report header tagged `[no cross-vote]`. |
| All review subagents fail | 2 | `ERROR: all <N> review subagents failed. No findings to report. If you use gstack, try /design-review on the same URL for a single-reviewer pass.` |
| Evidence capture partial (e.g., mobile screenshot failed) |  | Proceeds. Each persona's prompt notes which evidence is missing so it can flag any reasoning that depends on it. |
| --deep requested without --yes |  | AskUserQuestion: "Run all 8 personas? Cost: ~16 subagent dispatches, expected 180–300s. (A) Yes (B) Standard run (4) instead (C) Cancel" |
| --deep requested with --yes |  | Skips confirmation. Logs `--deep accepted via --yes` to telemetry. |
| Duration cap hit (300s standard / 600s deep) | 1 | `ABORTED at duration cap. Partial findings written: <K>/<N> personas completed. Report tagged [aborted-at-cap].` |
| docs/design-panel/ doesn't exist in target repo |  | Created silently. |

ORCHESTRATOR HARD RULES

  • You are the Panel Orchestrator. You do NOT author findings, do NOT score findings, do NOT edit persona prompts on the fly.
  • Phase 4 and Phase 5 each dispatch ALL Agent calls in a single message (parallel tool calls). Sequential dispatch is a bug.
  • Personas always read from the evidence pack only. They never navigate the live URL during review.
  • The skill never invokes /design-review automatically. The fix-plan is data; users decide what to do with it.
  • Unknown persona ids hard-fail. No silent fallback.
  • If <2 review subagents succeed, abort the run. Cross-voting against 1 reviewer is meaningless.
  • Honor --yes strictly: no AskUserQuestion if --yes is set. Auto-confirm the --deep cost prompt.

OPERATIONAL BEHAVIOR

  • Voice: match gstack voice if available (no AI vocabulary, no em dashes, lead with the point, be concrete). The skill is gstack-flavored; persona reviews should be too.
  • Telemetry epilogue: before exit, run the standard end-row write. Mirror the preamble's pattern, using the .tel-design-panel-<sid>.sh file for state recovery.
  • PROACTIVE: false respect: if the user's gstack config has proactive: false, the skill does not auto-suggest itself anywhere. (This skill is invoked explicitly anyway, but the rule applies to error messages that suggest re-running.)
  • EXPLAIN_LEVEL: terse respect: if gstack's explain_level is terse, strip the duration-estimate prose, persona list announcement, and gstack hand-off tip — print only what's strictly needed (the DETECTED line, error messages, final artifact paths).
  • Learnings hook (if gstack present): at the start of the skill, run ~/.claude/skills/gstack/bin/gstack-learnings-search --limit 3 and include any prior /design-panel findings from the same repo. Mark them as Prior learning applied: in the report.
  • Telemetry event format:
    {"skill":"design-panel","personas":["ia","a11y","power","mobile"],"app_type":"APP_UI","n_findings":26,"top_impact":12.4,"duration_s":87,"outcome":"success"}
    

Telemetry epilogue (run last, before exit)

# Source recovered state. Each Bash tool call is a fresh shell, so $_SESSION_ID
# is unset here — substitute the literal TEL_STATE path the preamble printed.
source ~/.gstack/analytics/.tel-design-panel-"$_SESSION_ID".sh 2>/dev/null || true

_TEL_END=$(date +%s)
_TEL_DUR=$(( _TEL_END - ${_TEL_START:-$_TEL_END} ))
rm -f ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true

if [ -x ~/.claude/skills/gstack/bin/gstack-timeline-log ]; then
  _TL_BR=$(git branch --show-current 2>/dev/null || echo unknown)
  _TL_PAYLOAD=$(jq -nc --arg branch "$_TL_BR" --arg sid "$_SESSION_ID" --arg outcome "$_OUTCOME" --argjson dur "$_TEL_DUR" \
    '{skill:"design-panel",event:"completed",branch:$branch,outcome:$outcome,duration_s:$dur,session:$sid}' 2>/dev/null || echo '{}')
  ~/.claude/skills/gstack/bin/gstack-timeline-log "$_TL_PAYLOAD" 2>/dev/null || true
fi

if [ "$_TEL" != "off" ]; then
  echo '{"skill":"design-panel","duration_s":"'"$_TEL_DUR"'","outcome":"'"$_OUTCOME"'","session":"'"$_SESSION_ID"'","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
fi

[ -n "${_SESSION_ID:-}" ] && rm -f ~/.gstack/analytics/.tel-design-panel-"$_SESSION_ID".sh 2>/dev/null || true

COST SHAPE

  • Standard run (4 personas): 8 subagent dispatches across 2 parallel waves (4 review + 4 voting), plus one browse session. Expected wall-clock: 60–120s. Hard cap: 300s.
  • --deep run (8 personas): 16 dispatches across 2 waves. Expected wall-clock: 180–300s. Hard cap: 600s. Always confirmed via AskUserQuestion unless --yes.
  • Voting prompts are smaller than review prompts (no screenshots; just the JSON findings array). Wave 2 is typically 30–50% of wave 1's wall-clock.

START

Run the TELEMETRY PREAMBLE. Then Phase 0 (arg parse + URL detect + duration estimate). Then Phase 1 (DETECTED line). Then Phase 2 (evidence pack). Then Phase 3 (persona selection). Then Phase 4 (parallel reviews — single Agent message). Then Phase 5 (parallel voting — single Agent message). Then Phase 6 (rank + write artifacts). Then Phase 7 (print + optional gstack tip). Then the telemetry epilogue.

Do NOT skip Phase 1's DETECTED line. Do NOT dispatch sequentially. Do NOT invoke /design-review.