Install

`openclaw skills install b143kc47-first-principles-thinking-v2`

Reason from fundamentals with in-session claim ledgers, mechanism maps, assumption checks, Fermi estimates, evidence grounding, verification questions, red-team critique, and structured brainstorming. Use for architecture, system design, technology selection, debugging, performance, migrations, strategy, product, research, scientific or business decisions, or for prompts like "first principles", "challenge assumptions", "from scratch", "brainstorm", or "think from fundamentals".

Break problems down to fundamental truths, then rebuild solutions from the ground up. Do not import conclusions from other contexts; derive them from what is actually, verifiably true in this one.
"To the first basis of the thing -- the first from which a thing is known." -- Aristotle, Metaphysics V.1
This skill is a reasoning guide only. It ships Markdown instructions and reference examples; it has no helper program, installer, credential requirement, network endpoint, or automatic state mechanism. Keep claim ledgers in the active conversation or in user-provided pasted context. Use external tools only when the runtime already provides them and the user's task itself requires evidence, calculation, or source inspection. Do not run bundled code for this skill.
<HARD-GATE> Do NOT propose a solution, recommend a technology, or start writing code until the intake, decomposition, and assumption-challenge phases below have run. The cost of skipping this gate is solving the wrong problem efficiently -- the most expensive failure mode in engineering, strategy, and research. </HARD-GATE>
Activate when the request shows one or more complexity markers: ambiguity, irreversibility, high stakes, data dependence, many interacting parts, or explicit brainstorming. Stay dormant when the request is small, scoped, and obviously mechanical.
When in doubt between activating and skipping, do not add process overhead. Either run Quick depth silently and produce a compact result, or ask one sentence: "This looks like a design decision -- should I challenge the assumptions first or go straight to implementation?"
State the detected depth at the start; the user can override.
| Level | When | What Runs |
|---|---|---|
| Quick | Medium complexity, manual /fp, reversible choice | Intake + compact ledger + mechanism sketch + one verification check |
| Standard | Architecture, tech selection, design, strategy, product decisions | Intake + Socratic probes + decomposition + mechanism map + reconstruction + verification |
| Deep | System design, hard debugging, high-stakes decisions, /fp deep | All phases including inversion, 2-3 reconstruction paths, sensitivity, self-consistency, verification |
| Exploration | Explicit brainstorming, invention, research framing, /fp brainstorm | Divergent search with ToT / GoT / morphological matrix / contradiction analysis, then convergence and tests |
The seven phases below are the operational form of a compact seven-step loop. Every run, regardless of depth or mode, should be traceable to it: intake, probe, decompose, invert, reconstruct, verify, emit.
Every non-trivial problem resolves into one primary mode. Detect it in Phase 1, state it aloud, and let it shape emphasis across the phases. A problem may carry a secondary mode (e.g. diagnosis-then-decision); name that too, but keep one as primary.
| Mode | Use when the user wants to... |
|---|---|
| decision | choose among options (tech, design, vendor, build-vs-buy, staging) |
| diagnosis | explain a symptom, failure, regression, or anomaly |
| planning | get from a current state to a desired state on a sequence of steps |
| critique | stress-test a claim, proposal, argument, belief, or narrative |
| explanation | understand a mechanism, model, or system without deciding anything |
| synthesis | rebuild a messy or multi-frame problem into a single coherent view |
| exploration | generate, expand, combine, and filter non-obvious options or hypotheses |
Default assignment heuristics: match the user's primary verb and desired outcome to the closest row above, naming one primary mode and at most one secondary.
Modes are not pre-built answers; they tell you which phases tighten and which relax. The Mode Playbooks section at the bottom spells out each route.
Use the smallest reasoning stack that can safely answer the problem. More steps are not automatically better; activate heavier tools only when the problem is ambiguous, irreversible, high-stakes, data-dependent, or explicitly brainstorming-oriented.
| Signal | Add this tool |
|---|---|
| Hidden assumptions likely | Assumption Ledger with fragility, failure mode, and fastest test |
| Mechanism or causality matters | Causal / mechanism map: variables, links, confounders, feedback loops |
| Many moving parts | Least-to-most decomposition into smallest solvable subproblems |
| Numbers, capacity, cost, scale, or physical constraints matter | Fermi estimate, dimensional check, low/base/high range |
| Recent, factual, niche, legal, scientific, product, price, or data claims matter | Evidence grounding through approved search, user-provided sources, calculations, or citations |
| Open-ended ideation | Tree of Thoughts, Graph of Thoughts, morphological matrix, contradiction analysis |
| Competing explanations | Self-consistency across 2-3 independent paths and a discriminating test |
| High-stakes or likely hallucination risk | Chain-of-verification, backward check, red team, sensitivity analysis |
| Hard equations, schedules, constraints, optimization, or Boolean logic | Formalize variables and use approved calculation, spreadsheet, solver, or symbolic-math tools when available |
When tools are needed, choose only runtime-approved tools and state them in a short Tool Plan before running the analysis. For Quick depth, the Tool Plan can be one line.
Use these additions whenever Standard, Deep, or evidence-dependent work is
requested. See references/advanced-reasoning-tools.md for detailed prompts.
Use these additions when the mode is exploration, synthesis, early strategy, product ideation, research design, invention, or the user asks for options. Diverge first, converge second.
Phases run in order. Earlier phases may be abbreviated at Quick depth, but never skipped entirely.
Restate the problem in outcome terms, not solution terms. Classify the task into one primary mode (and at most one secondary). If enough context exists, proceed with explicit working assumptions instead of asking for confirmation.
"I read the outcome as: [outcome in one sentence]. Current approach or framing: [current solution idea, if any]. Mode: [mode] (secondary: [mode | none]). Depth: [Quick / Standard / Deep / Exploration]. Working assumption if not corrected: [...]."
Rules: keep the restatement in outcome terms, proceed on explicit working assumptions rather than blocking on confirmation, and let the user override the detected mode or depth at any point.
Probe the problem with the question types below. Pick the most relevant 3-5
for the problem at hand; asking all of them robotically is worse than asking
three well-chosen ones. For the full catalog including probes for each type,
see references/techniques.md.
Clarification -- "What exactly do you mean by X?" / "Concrete example?" / "What does success look like, measurably?"
Assumption Probing -- "Why does it have to work that way?" / "Who decided this, and what was their reasoning?" / "Is this a hard requirement or inherited from a previous design?"
Evidence -- "What data shows this is the bottleneck?" / "Have you measured it, or is it a guess?" / "How do you know users actually need this?"
Alternative Viewpoints -- "How would a team with opposite constraints solve this?" / "What would you build starting from zero today?" / "What would a critic of this approach say?"
Implications -- "What are the second-order effects?" / "What breaks if this assumption is wrong?" / "What's the cost of reversing this decision later?"
Meta -- "Are we solving the right problem?" / "Is this the simplest form of the problem?" / "What happens if we just... don't do this?"
Certain red-flag phrases almost always hide an assumption. When you hear one, drop into assumption-probing mode.
Cadence: 3-5 well-chosen probes per pass; stop probing once further answers would no longer change the ledger.
Break the problem into atomic components and file every component into the Claim Ledger. The ledger is the canonical record of what you know, what you were told, what you are guessing, what binds you, and what is missing. Nothing downstream -- inversion, reconstruction, verification -- may cite a fact that is not in the ledger.
Before proposing paths, build a compact Mechanism Map: variables and actors, causal links, bottlenecks, feedback loops, boundary conditions, and confounders or hidden variables.
Then run Least-to-Most Decomposition: write 3-7 smallest solvable subproblems. Each subproblem must yield one intermediate variable, constraint, mechanism claim, risk, or testable unknown before synthesis.
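A minimal sketch of what these two artifacts can look like as plain data, using an invented API-latency problem -- illustrative only, since this skill ships no executable code, and every name and number below is hypothetical:

```python
# Illustrative only: the skill ships no code; all names are hypothetical.
# A compact mechanism map for an invented API-latency problem,
# followed by its least-to-most decomposition.
mechanism_map = {
    "variables": ["request_rate", "cache_hit_ratio", "db_query_time", "p99_latency"],
    "causal_links": [
        ("request_rate", "db_query_time", "+"),     # more load -> slower queries
        ("cache_hit_ratio", "db_query_time", "-"),  # more hits -> fewer queries
        ("db_query_time", "p99_latency", "+"),
    ],
    "bottleneck": "db_query_time",
    "feedback_loops": ["timeouts trigger retries, which raise request_rate further"],
    "confounders": ["nightly batch job competing for the same database"],
}

# 3-7 smallest solvable subproblems; each yields one intermediate output.
subproblems = [
    ("measure production cache_hit_ratio", "intermediate variable"),
    ("bound db_query_time at peak request_rate", "constraint"),
    ("check whether retries amplify load", "mechanism claim"),
    ("find what the batch job locks", "testable unknown"),
]
```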
Prior-claim intake (before filling the ledger). This variant keeps prior claims in the conversation. If the user pastes a previous Claim Ledger, provides notes, or supplies connector/source content, import it as candidate context and log the outcome.
Log one of: imported as [CLAIM], superseded by newer context, or rejected -- each with a one-line reason.
Imported claims re-enter as [CLAIM], never as [TRUTH]. They must pass
the Ground-Truth Test below to be promoted back to [TRUTH]. This is the
primary mitigation for anchoring bias; do not skip it even when the imported
claim was previously marked verified. See references/session-ledger-template.md
for the in-session ledger format.
The five ledger lanes:
| Lane | Definition | Tag |
|---|---|---|
| Verified facts | Provable in this context: physics, math, measurement, executable check, stipulation. Passes the Ground-Truth Test. | [TRUTH] |
| Reported claims | Statements from the user, a source, or prior art, not yet verified. Treated as conditional until promoted or rejected. | [CLAIM] |
| Assumptions | Inherited convention, habit, team preference, or unverified belief being used as if it were a truth. | [ASSUMPTION] |
| Constraints | Hard limits the solution must respect: regulatory, contractual, budget, headcount, latency / throughput SLOs, compatibility. | [CONSTRAINT] |
| Unknowns | A fact we'd need but don't yet have. Blocks trust in the decomposition until resolved or bounded. | [UNKNOWN] |
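To make the lanes concrete, here is a minimal sketch of ledger entries as plain data. The real ledger lives in conversation text, not code; every statement below is an invented example:

```python
# Illustrative only -- the real ledger is conversation text, not code.
# One invented entry per lane, carrying the fields the lane
# disciplines below require for each tag.
ledger = [
    {"tag": "[TRUTH]",      "statement": "p99 latency measured at 840 ms",
     "basis": "production measurement"},
    {"tag": "[CLAIM]",      "statement": "users abandon checkout above 500 ms",
     "source": "PM report", "verify_by": "funnel analytics"},
    {"tag": "[ASSUMPTION]", "statement": "traffic doubles every year",
     "fragility": "breaks if the planned market launch slips"},
    {"tag": "[CONSTRAINT]", "statement": "p99 must stay under 300 ms",
     "source": "customer SLO", "violation_cost": "service credits"},
    {"tag": "[UNKNOWN]",    "statement": "production cache hit ratio",
     "resolve_by": "enable cache metrics for one week"},
]
```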
Ground-Truth Test -- before tagging something [TRUTH], ask: (1) Is it provable in this context by physics, math, measurement, an executable check, or explicit stipulation? (2) Could you demonstrate it right now if challenged? (3) Does it hold regardless of who asserted it?
If the answer to any of the three is "no" or "not sure", route it to [CLAIM], [ASSUMPTION], [CONSTRAINT], or [UNKNOWN] instead. User-supplied statements start life as [CLAIM]; they are promoted to [TRUTH] only after the test passes, or downgraded to [ASSUMPTION] if the belief is being used without verification.
Assumption Ledger v2 -- classify each [ASSUMPTION] and record its risk profile:
| Category | Key Question |
|---|---|
| Technical | "Must this technology / pattern / protocol be used?" |
| Business | "Is this requirement actually fixed, or negotiable?" |
| Resource | "Are these constraints real (budget, headcount) or perceived?" |
| Historical | "Why was this chosen originally? Do those conditions still hold?" |
| Behavioral | "Are we assuming users, teams, markets, or adversaries will behave a certain way?" |
| Data / Evidence | "Are we assuming a measurement, source, benchmark, or sample is representative?" |
For each assumption, capture:
| Field | Required content |
|---|---|
| Evidence | What supports it now, if anything |
| Confidence | low / medium / high |
| Fragility | what would make it break |
| Failure mode | how the final answer fails if it is false |
| Fastest test | cheapest observation, experiment, search, or calculation that would check it |
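As a worked illustration, here is one hypothetical Assumption Ledger v2 record with every field from the table filled in (invented content, sketch only):

```python
# Hypothetical record; one key per required field in the table above.
assumption = {
    "statement":    "PostgreSQL cannot serve our read volume",
    "category":     "Technical",
    "evidence":     "a single 2021 load test, run before indexing work",
    "confidence":   "low",
    "fragility":    "breaks if read replicas or caching are on the table",
    "failure_mode": "we build and operate a second datastore nobody needed",
    "fastest_test": "replay one day of production reads against a tuned replica",
}
```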
Constraint discipline: a [CONSTRAINT] must name (a) its source
(regulator, contract, SLO doc, hardware limit), (b) its numeric threshold
where applicable, and (c) the cost of violating it. Unsourced "constraints"
are [ASSUMPTION]s in disguise.
Unknowns discipline: every [UNKNOWN] must state (a) what it is, (b)
how it would be resolved (measurement, document, stakeholder), and (c)
whether the downstream recommendation changes if the resolution lands at
either end of the plausible range. If a recommendation is stable across the
range, the unknown is not blocking.
Recursion rule: if a component reveals its own hidden assumptions (e.g. "we need a message queue" contains "we need async processing"), flag it:
"This sub-problem has its own assumptions. Going one level deeper."
Run Phases 2-3 on the sub-problem, then resume. Maximum recursion depth: 2
levels. If you hit the limit, list the unexplored sub-problem as an
[UNKNOWN] in the ledger.
Invert the question. Instead of "how do I make this succeed?", ask:
"What would guarantee this fails? What must I avoid at all costs?"
List 3-5 failure modes. For each, identify which ground truth or design choice would prevent it. Failure modes the current design does not prevent are risks that must be addressed or accepted explicitly.
Inversion is cheap and catches assumption gaps the forward analysis misses.
Munger's rule: "Invert, always invert." See references/techniques.md for
the inversion playbook.
Build 2-3 candidate solution paths using only the verified ground truths. For exploration mode, build 3-5 paths first, then converge. For each path, state:
- [TRUTH]s and [CONSTRAINT]s it is built on
- [UNKNOWN]s and how they'd be resolved

If magnitudes matter, add a Fermi / dimensional sanity check before ranking: write the proxy equation, estimate low/base/high values, check units, and name the dominant variable (a minimal sketch follows below). Evaluate each path on its own merits. The conventional path may win -- but only because the analysis led there, not because it was the default.
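A minimal sketch of such a check, using an invented log-storage question with hypothetical numbers throughout:

```python
# Illustrative Fermi / dimensional check for an invented question:
# how much storage does one year of event logs need?
# Proxy equation: storage [TB] = events/day * bytes/event * days / 1e12.
DAYS = 365
cases = {
    "low":  {"events_per_day": 1e6, "bytes_per_event": 200},
    "base": {"events_per_day": 5e6, "bytes_per_event": 500},
    "high": {"events_per_day": 2e7, "bytes_per_event": 1500},
}
for name, v in cases.items():
    # Unit check: [events/day] * [bytes/event] * [days] -> bytes.
    tb = v["events_per_day"] * v["bytes_per_event"] * DAYS / 1e12
    print(f"{name}: {tb:.2f} TB")   # ~0.07 / ~0.91 / ~10.95 TB
# Dominant variable: events_per_day spans 20x across the range,
# bytes_per_event only 7.5x -- pin down traffic first.
```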
Chesterton's Fence check: before recommending the removal of any existing structure (code, system, process), ask why it was built. If you cannot state the original reason and whether the conditions still hold, you do not yet have the right to remove it.
Before handing over the recommendation, stress-test it:
- Every major choice must cite a [TRUTH] or [CONSTRAINT], not another [ASSUMPTION].
- State which [UNKNOWN]s remain open and how sensitive the recommendation is to them.

Emit a structured "First Principles Analysis" block. This artifact stays in context and guides all subsequent work in the session.
Quick artifact:
## First Principles Analysis
**Problem (outcome):** [one sentence]
**Mode:** [decision | diagnosis | planning | critique | explanation | synthesis | exploration]
**Depth:** Quick
### Tool Plan
[1 line: mechanism map / Fermi check / verification / brainstorm as needed]
### Claim Ledger (compact)
- [TRUTH] [...]
- [CLAIM] [user-supplied, not yet verified]
- [ASSUMPTION] [inherited / conventional + fragility]
- [CONSTRAINT] [hard limit + source]
- [UNKNOWN] [fact needed + how to get it]
### Mechanism Sketch
[Variables -> causal link -> expected outcome; name bottleneck or feedback loop]
### Assumptions Challenged
| Assumption | Challenge | Failure if false | Fastest test | Verdict |
|------------|-----------|------------------|--------------|---------|
| [...] | [...] | [...] | [...] | Keep / Modify / Discard / Investigate |
### Recommended Approach
[Solution with brief reasoning grounded in the ledger above.]
### Verification Check
- Falsifier / backward check / sensitivity note: [...]
Standard / Deep artifact:
## First Principles Analysis
**Problem (outcome):** [one sentence]
**Mode:** [primary] (secondary: [mode | none])
**Depth:** Standard | Deep
### Tool Plan
[Which tools were activated: mechanism map, evidence grounding, ToT/GoT, Fermi, verification, solver]
### Claim Ledger
**Verified facts**
- [TRUTH] [fact + why irreducible]
**Reported claims (user or source, not yet verified)**
- [CLAIM] [statement + who/source asserted it + how it would be verified]
**Assumptions**
- [ASSUMPTION] [convention / habit + category + confidence + fragility]
**Constraints**
- [CONSTRAINT] [limit + source + numeric threshold + cost of violation]
**Unknowns**
- [UNKNOWN] [fact needed + how to resolve + is the recommendation sensitive to it?]
### Mechanism Map
- Variables / actors:
- Causal links:
- Bottlenecks:
- Feedback loops:
- Boundary conditions:
- Confounders / hidden variables:
### Decomposition
| Subproblem | Intermediate output | Status |
|------------|---------------------|--------|
| [...] | [...] | solved / bounded / unknown |
### Assumptions Challenged
| Assumption | Category | Evidence | Confidence | Fragility | Failure if false | Fastest test | Verdict |
|------------|----------|----------|------------|-----------|------------------|--------------|---------|
| [...] | Tech / Biz / Resource / Historical / Behavioral / Data | [...] | low / medium / high | [...] | [...] | [...] | Keep / Modify / Discard / Investigate |
### Inversion (what would guarantee failure)
- [failure mode] -> prevented by [truth / design choice] | NOT prevented -> risk to accept
### Reconstruction
**Path A** -- [name]
- Built on: [TRUTHs / CONSTRAINTs]
- Design choices: [...]
- Trade-offs: [...]
**Path B** -- [name]
- Built on: [TRUTHs / CONSTRAINTs]
- Design choices: [...]
- Trade-offs: [...]
### Recommendation
[Chosen path. Every major choice cites the `[TRUTH]` or `[CONSTRAINT]` that forces it.]
### Strongest Alternative View
[Best objection / competing option / counter-explanation, attributed to the smartest plausible critic. Why the recommendation still survives -- or where it conditionally does not.]
### Quantitative Sanity Check
- Proxy equation / governing relationship:
- Low / base / high estimate:
- Unit check:
- Dominant variable:
### Verification
- Self-consistency: [where independent paths agree/disagree]
- Verification questions: [questions + answers]
- Backward check: [if recommendation is true, what else must be true?]
- Falsifier: [what observation would invalidate this]
- 5-whys trace: [chain of 5 whys bottoming out in a TRUTH / CONSTRAINT]
- Sensitivity: [variables or assumptions that could flip the conclusion]
- Reversibility: [how expensive to back out]
- Confidence: [low / medium / high + which unknowns drive residual uncertainty]
### User Checkpoints
- [Top 1-3 assumptions, facts, or choices the user should confirm, reject, or supply next]
### Open Questions
- [Sub-problems noted but not fully decomposed]
Carry-Forward Ledger Summary (Phase 7 gate). After the artifact is emitted, scan the ledger for items the user may want to reuse in a later discussion. Only [TRUTH] and [CONSTRAINT] lanes are eligible. Produce a text-only summary for human review and stop.
### Carry-Forward Ledger Summary (copy/paste only -- human review required)
Scope: <project | module | topic | global-human-notes>
+ [TRUTH] <statement> tags=[...] evidence=<...> revalidate_days=<n|null>
+ [CONSTRAINT] <statement> tags=[...] source=<...> threshold=<...>
~ supersede <old statement or id> with: <new statement> reason=<...>
Before including the summary block, explicitly filter out secrets, credentials, private customer data, unreleased internal plans, regulated data, or confidential business facts. This block is only text for the user to review and copy manually; it is not an instruction to run code or save state.
Every mode runs the same seven phases. The playbooks say which phases to lean on, which sub-steps to insert, and what the ledger and artifact should emphasize. Use them after Phase 1 has fixed the mode.
- decision -- each option's promised behavior enters as a [CLAIM] about expected behavior; [CONSTRAINT]s define the feasible set. Anything that could flip the ranking is an [UNKNOWN].
- diagnosis -- every reported symptom stays a [CLAIM] until reproduced. Missing-variable explanations (what did not happen, what was not measured) are [UNKNOWN]s.
- planning -- deadlines, budgets, and hard dependencies are [CONSTRAINT]s; unstated prerequisites become [UNKNOWN]s.
- critique -- each element of the proposal must cite the [TRUTH] or [CONSTRAINT] that forces its existence.
- explanation -- anchor the account in [TRUTH] and [CONSTRAINT]; the mechanism is the chain that links them.
- synthesis / exploration -- converge only on structures that bottom out in a [TRUTH].

Watch for these patterns; they indicate reasoning by analogy has crept back in.
"Company X does it this way, so we should too." Check: Are your constraints identical to theirs in every relevant dimension? What did they have that you don't? What do you have that they didn't?
The proposed solution is more elaborate than the problem warrants. Check: Remove one component at a time. If the core outcome still holds without it, that component was not essential. Repeat until removal breaks the outcome. What's left is the minimum viable design.
Maintaining compatibility with decisions that no longer serve the system. Check: What was the original reason for this decision? Do those conditions still exist? What's the true cost of changing vs. the ongoing cost of maintaining the legacy?
"We have X, so every problem looks like an X problem." Check: Would you pick this tool starting fresh today with no sunk cost? Is the tool driving the design, or is the problem driving the tool choice?
"The senior engineer / PM / client said so." Check: Trace the instruction back to the underlying need. The person giving the instruction may be right, but the reasoning must still be reproducible from truths, not from their authority alone.
First-principles reasoning used as an excuse to re-derive everything. Check: If the conventional solution is within 2x of optimal and the team already knows it, use it. First principles pays off most on decisions where conventional wisdom is 10x wrong, not 10% suboptimal.
- references/techniques.md -- Reasoning techniques toolbox: full Socratic catalog, 5 Whys, Inversion playbook, Chesterton's Fence, Falsifiability, Tree-of-Thoughts branching, Occam's Razor, mechanism mapping, Fermi checks, sensitivity analysis, verification chains, and structured brainstorming. Load when you need to pick the right tool for a phase.
- references/advanced-reasoning-tools.md -- Expanded accuracy and brainstorming playbooks: ReAct-style evidence grounding, causal graphs, assumption ledger v2, least-to-most decomposition, self-consistency, chain-of-verification, Graph of Thoughts, morphological analysis, contradiction analysis, and multi-perspective debate. Load for Deep or Exploration depth.
- references/examples.md -- Four worked engineering examples (Redis caching, microservices split, auth scheme, database selection) showing the full phase flow end to end. Load to see what good output looks like.
- references/session-ledger-template.md -- In-session Claim Ledger import and carry-forward summary format.
- references/review-notes.md -- Human-review notes confirming this package contains no executable helper or automatic state mechanism.

This skill will: challenge framing and assumptions, keep an explicit in-conversation Claim Ledger, and ground every recommendation in [TRUTH]s and [CONSTRAINT]s.
This skill will not: run bundled code, call network endpoints, require credentials, or save state automatically; ledgers stay in the conversation or in user-pasted context.
Before emitting a recommendation, confirm:
- every major choice cites a [TRUTH] or [CONSTRAINT], not another [ASSUMPTION]
- remaining [UNKNOWN]s are listed, with the recommendation's sensitivity to them stated