Install
openclaw skills install prompt-debugger

Debug prompts that produce unexpected AI outputs: diagnose failure modes, identify ambiguity and conflicting instructions, test variations, compare model responses, and iteratively improve prompt quality.

When a prompt isn't working as expected, systematically diagnose why and fix it. This skill identifies common failure patterns (ambiguity, conflicting instructions, missing context, wrong format specification), tests variations, and produces an improved version.
Use when: "why isn't this prompt working", "debug my prompt", "improve this prompt", "the AI keeps doing X instead of Y", "prompt not producing expected output", "prompt optimization", or iterating on system prompts.
diagnose: Analyze a Failing Prompt

Given a prompt and its undesired output, identify the root cause.
Read the prompt and check for common failure patterns:
Ambiguity Checks:
Conflict Checks:
Context Checks:
Instruction Clarity:
Categorize the issue:
| Failure Mode | Symptoms | Common Fix |
|---|---|---|
| Instruction Following | Ignores specific requirements | Move to top, bold, repeat |
| Format Violation | Wrong output structure | Add explicit format example |
| Hallucination | Makes up facts | Add "only use provided info" |
| Scope Creep | Answers more than asked | Add "only address X, nothing else" |
| Scope Deficit | Answers less than asked | Break into numbered sub-questions |
| Tone Mismatch | Wrong voice/register | Provide tone examples |
| Overthinking | Too verbose/philosophical | Add "be direct, no preamble" |
| Underthinking | Too shallow/generic | Add "think step by step" + require specifics |
| Context Window | Loses early instructions | Repeat key constraints at end |
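The table above can be treated as a simple lookup from failure mode to a first fix to try. A minimal Python sketch follows; the names `COMMON_FIXES` and `suggest_fix` are illustrative and not part of the skill itself.

```python
# Failure modes and common fixes, copied from the table above.
COMMON_FIXES = {
    "Instruction Following": "Move to top, bold, repeat",
    "Format Violation": "Add explicit format example",
    "Hallucination": 'Add "only use provided info"',
    "Scope Creep": 'Add "only address X, nothing else"',
    "Scope Deficit": "Break into numbered sub-questions",
    "Tone Mismatch": "Provide tone examples",
    "Overthinking": 'Add "be direct, no preamble"',
    "Underthinking": 'Add "think step by step" + require specifics',
    "Context Window": "Repeat key constraints at end",
}


def suggest_fix(failure_mode: str) -> str:
    """Return the table's common fix for a failure mode (case-insensitive)."""
    normalized = {name.lower(): fix for name, fix in COMMON_FIXES.items()}
    key = failure_mode.strip().lower()
    if key not in normalized:
        raise KeyError(f"Unknown failure mode: {failure_mode!r}")
    return normalized[key]


print(suggest_fix("scope creep"))  # Add "only address X, nothing else"
```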
For each identified issue, propose specific prompt edits:
Issue 1: Ambiguous instruction "analyze the data"
→ Fix: "Analyze the data by calculating the mean, median, and standard deviation for each column. Report any outliers (>2 standard deviations from mean)."
Issue 2: Missing output format
→ Fix: Add "Output format: JSON with keys {summary, findings, recommendations}"
Issue 3: Conflicting constraints
→ Fix: "Prioritize accuracy over brevity. If you must choose between being complete and being concise, be complete."
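Once a format fix like the one in Issue 2 is in place, the model's responses can be checked mechanically. The sketch below assumes the response arrives as a raw string; `check_output_format` is a hypothetical helper, and only the three key names come from the example above.

```python
import json

# Keys taken from the "Output format" fix in Issue 2.
REQUIRED_KEYS = {"summary", "findings", "recommendations"}


def check_output_format(raw_response: str) -> list[str]:
    """Return a list of format problems; an empty list means the response passes."""
    try:
        parsed = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(parsed, dict):
        return ["top-level value is not a JSON object"]
    missing = sorted(REQUIRED_KEYS - set(parsed))
    return [f"missing key: {key}" for key in missing]


good = '{"summary": "ok", "findings": [], "recommendations": []}'
bad = '{"summary": "ok"}'
print(check_output_format(good))  # []
print(check_output_format(bad))   # ['missing key: findings', 'missing key: recommendations']
```

Running a check like this over several responses helps distinguish a persistent format violation from a one-off.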
compare: A/B Test Prompt Variations

Generate 3-5 variations of a prompt, each targeting a different failure mode fix.
## Variation A: Original (baseline)
[original prompt]
Expected improvement: none (baseline for comparison)
## Variation B: Explicit format
[prompt + format specification]
Target fix: format violation
## Variation C: Role + examples
[prompt + persona + 2 examples]
Target fix: tone mismatch, underthinking
## Variation D: Constraints tightened
[prompt + explicit constraints + negative examples]
Target fix: scope creep, hallucination
## Variation E: Restructured
[reordered prompt with critical instructions first/last]
Target fix: instruction following
For each variation, explain what was changed and why.
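A rough Python sketch of how such variations might be assembled from one base prompt. The function name, parameters, and exact wording of each variation are assumptions for illustration; Variation C is omitted because it needs a persona and examples.

```python
def build_variations(base_prompt: str, format_spec: str, constraints: str) -> dict[str, str]:
    """Return labeled prompt variations, each changing one thing relative to the baseline."""
    return {
        # Baseline: unchanged, used for comparison.
        "A: Original (baseline)": base_prompt,
        # Targets format violations.
        "B: Explicit format": f"{base_prompt}\n\nOutput format: {format_spec}",
        # Targets scope creep and hallucination.
        "D: Constraints tightened": f"{base_prompt}\n\nConstraints: {constraints}",
        # Targets instruction following: critical constraints first and last.
        "E: Restructured": f"{constraints}\n\n{base_prompt}\n\n{constraints}",
    }


variations = build_variations(
    "Summarize the report.",
    "three bullet points",
    "Use only the provided text.",
)
for label, prompt in variations.items():
    print(f"## Variation {label}\n{prompt}\n")
```

Changing one thing per variation keeps the comparison interpretable: if Variation B fixes the output, the format specification was the missing piece.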
rewrite: Produce an Improved Prompt

Apply all identified fixes to produce a single improved prompt.
Rewrite principles:
Before/After format:
### Before
[original prompt — highlight problematic areas]
### After
[improved prompt — annotate what changed and why]
### Changes Made
1. Added role specification ("You are a senior data analyst...")
2. Replaced "analyze" with specific analytical steps
3. Added output format (JSON schema)
4. Moved length constraint to the end (recency)
5. Added negative example ("Do NOT include...")
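To make the before/after comparison concrete, the two prompt versions can be diffed line by line. This sketch uses Python's standard `difflib`; the helper name `prompt_diff` and the sample prompts are illustrative only.

```python
import difflib


def prompt_diff(before: str, after: str) -> str:
    """Return a unified diff between two prompt versions, line by line."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
    return "\n".join(diff)


before = "Analyze the data."
after = "You are a senior data analyst.\nCalculate the mean and median for each column."
print(prompt_diff(before, after))
```

Removed lines are prefixed with `-` and added lines with `+`, which pairs naturally with a "Changes Made" list.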
patterns: Common Prompt Patterns Library

A reference of proven prompt patterns for common tasks:
Chain of Thought:
Think through this step by step:
1. First, identify...
2. Then, analyze...
3. Finally, recommend...
Show your reasoning for each step.
Few-Shot:
Here are examples of the expected output:
Input: [example 1 input]
Output: [example 1 output]
Input: [example 2 input]
Output: [example 2 output]
Now process:
Input: [actual input]
Output:
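The few-shot template can be filled in programmatically. A minimal sketch, assuming examples are supplied as (input, output) pairs; `build_few_shot_prompt` is a hypothetical helper name.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], actual_input: str) -> str:
    """Fill the few-shot template with example pairs and the real input."""
    lines = ["Here are examples of the expected output:"]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
    lines.append("Now process:")
    lines.append(f"Input: {actual_input}")
    # Leave the final "Output:" open for the model to complete.
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    [("great product", "positive"), ("arrived broken", "negative")],
    "works as described",
)
print(prompt)
```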
Constraint Sandwich:
[CRITICAL CONSTRAINTS — read first]
[Main task instructions]
[CRITICAL CONSTRAINTS — repeated for emphasis]
Persona + Task + Format:
You are [specific role] with [specific expertise].
Your task is to [specific action] for [specific audience].
Output as [specific format] with [specific requirements].
Self-Verification:
After generating your response, verify:
- Does it address all N requirements?
- Is it under X words?
- Does it follow the specified format?
If not, revise before outputting.
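The same checks can also be run outside the model as a safety net. The sketch below is a crude stand-in, not the pattern itself: it treats a requirement as addressed when a keyword appears in the response, which is an assumption that will miss paraphrases. `verify_response` and its parameters are hypothetical.

```python
def verify_response(response: str, required_terms: list[str], max_words: int) -> list[str]:
    """Return failed checks; an empty list means every check passed."""
    failures = []
    lowered = response.lower()
    # Keyword presence is only a rough proxy for "addresses the requirement".
    for term in required_terms:
        if term.lower() not in lowered:
            failures.append(f"requirement not addressed: {term}")
    word_count = len(response.split())
    if word_count > max_words:
        failures.append(f"too long: {word_count} words (limit {max_words})")
    return failures


print(verify_response("Mean and median reported.", ["mean", "median"], 10))  # []
```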
score: Rate Prompt Quality

Score a prompt on multiple dimensions (0-10 each):
| Dimension | Score | Assessment |
|---|---|---|
| Clarity | 7/10 | Instructions are clear but "analyze" is ambiguous |
| Specificity | 4/10 | Missing format, length, audience |
| Completeness | 6/10 | Has context but no examples |
| Consistency | 8/10 | No conflicting instructions |
| Testability | 3/10 | No success criteria defined |
| Overall | 5.6/10 | Needs format spec and examples |
Provide the top 3 improvements that would most increase the score.
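The source does not say how the overall score is computed, but an unweighted mean of the five dimensions reproduces the 5.6 in the example table. The sketch below assumes that, and also assumes the lowest-scoring dimensions are a reasonable starting point for choosing improvements.

```python
from statistics import mean

# Dimension scores from the example table.
scores = {
    "Clarity": 7,
    "Specificity": 4,
    "Completeness": 6,
    "Consistency": 8,
    "Testability": 3,
}

# Assumption: overall is the unweighted mean, rounded to one decimal.
overall = round(mean(scores.values()), 1)
print(overall)  # 5.6

# Assumption: the three weakest dimensions are the first candidates for improvement.
weakest = sorted(scores, key=scores.get)[:3]
print(weakest)  # ['Testability', 'Specificity', 'Completeness']
```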
anti-patterns: Detect Common Prompt Anti-Patterns

Scan a prompt for known problematic patterns:
{diagnosis: {issues: [], failure_modes: [], fixes: []}, rewrite: "", score: {}, anti_patterns: []}
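Assuming the line above is shorthand for a JSON object, the structure could be built and serialized as follows. The helper name `empty_report` and the sample values are illustrative; only the key names come from the source.

```python
import json


def empty_report() -> dict:
    """Return an empty report with the keys shown in the output shape above."""
    return {
        "diagnosis": {"issues": [], "failure_modes": [], "fixes": []},
        "rewrite": "",
        "score": {},
        "anti_patterns": [],
    }


report = empty_report()
report["diagnosis"]["failure_modes"].append("Format Violation")
report["diagnosis"]["fixes"].append("Add explicit format example")
print(json.dumps(report, indent=2))
```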