Install
openclaw skills install leanstral-formal-verificationFormal verification using Lean 4 + Leanstral (labs-leanstral-2603) model. Use when: you need mathematical proof of code correctness, protocol verification, algorithm correctness, security property proofs, or any property that can be expressed as a logical theorem. Triggers: "formal proof", "formal verification", "Lean proof", "mathematical proof", "theorem proving", "Leanstral", "code verification", "correctness proof"
openclaw skills install leanstral-formal-verificationA skill for formal verification using Lean 4 + Mathlib + the Leanstral model
(labs-leanstral-2603) from Mistral AI to mathematically prove code properties.
🔑 Requires a Mistral API key. Set via
MISTRAL_API_KEYenvironment variable. The Leanstral model is available for free via Mistral's API as of 2026-05-17. Get a key at: https://console.mistral.ai/api-keysexport MISTRAL_API_KEY="your-key-here"⚠️ When using this skill:
- Theorem statements and code are sent to Mistral's API. Do not include confidential code, secrets, or proprietary protocol details in prompts.
- Generated
.leanfiles are untrusted code —verify.shrejects proof holes (sorry/admit) and compiles in an isolated project directory.- Always review the generated theorem statements. Lean verifies that the proof is valid, not that the theorem matches your intent.
Leanstral is Mistral AI's first open-source code agent for Lean 4 (released March 2026). It is specialized for theorem proving — it generates proofs, and Lean 4 verifies them mechanically. When a proof passes, correctness is a mathematical fact, not a probability.
| Spec | Value |
|---|---|
| Architecture | Mixture of Experts (128 experts, 4 active per token) |
| Total parameters | 119B |
| Active parameters | 6.5B per token |
| Context length | 256K tokens |
| License | Apache 2.0 (open weights) |
| API model ID | labs-leanstral-2603 |
| API base URL | https://api.mistral.ai/v1 |
General-purpose LLMs (GPT, Claude, etc.) write code and hope it works — they test a few inputs but miss edge cases. Leanstral writes code and proves it correct for all inputs. When Lean 4 accepts the proof, correctness is guaranteed mathematically.
| Model | Cost ($) | Score |
|---|---|---|
| Claude Haiku | 184 | 23.0 |
| Claude Sonnet | 549 | 23.7 |
| Claude Opus 4.6 | 1,650 | 39.6 |
| Leanstral pass@1 | 18 | 21.9 |
| Leanstral pass@2 | 36 | 26.3 |
| Leanstral pass@4 | 72 | 29.3 |
| Leanstral pass@16 | 290 | 31.9 |
Pass@16 beats Sonnet at 1/2 the cost, approaches Opus at 1/6 the cost.
The primary way to use this skill: call the Mistral API directly.
curl -X POST "https://api.mistral.ai/v1/chat/completions" \
-H "Authorization: Bearer $MISTRAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "labs-leanstral-2603",
"temperature": 1.0,
"max_tokens": 32000,
"messages": [{
"role": "user",
"content": "Prove the following theorem in Lean 4:\n\ntheorem add_comm (a b : Nat) : a + b = b + a := by\n sorry"
}]
}'
Or with Python:
from openai import OpenAI
client = OpenAI(
api_key="***",
base_url="https://api.mistral.ai/v1"
)
response = client.chat.completions.create(
model="labs-leanstral-2603",
temperature=1.0,
max_tokens=32000,
messages=[{"role": "user", "content": "Prove that..."}],
)
print(response.choices[0].message.content)
| Parameter | Recommended | Why |
|---|---|---|
temperature | 1.0 | Diverse proof strategies. Lower values produce repetitive attempts. |
max_tokens | 32000 | Proofs are verbose. Generous output budget avoids truncation. |
reasoning_effort | "high" (Mistral-specific) | Required for non-trivial proofs. Drop to "medium" only for simple boolean logic. |
Leanstral improves significantly with multiple attempts. If a proof fails on the first try, call the API again — the model explores different proof strategies each time. pass@2 adds +4.4 points to the score. For automated workflows, loop with retry logic:
for i in 1 2 3; do
curl -s -X POST "https://api.mistral.ai/v1/chat/completions" \
-H "Authorization: Bearer $MISTRAL_API_KEY" \
-H "Content-Type: application/json" \
-d "{\"model\":\"labs-leanstral-2603\",\"temperature\":1.0,\"max_tokens\":32000,\"messages\":[{\"role\":\"user\",\"content\":\"$(cat proof_request.txt | jq -Rs .)\"}]}" \
| jq -r '.choices[0].message.content' > proof_attempt_$i.lean
# Verify with Lean 4
bash verify.sh proof_attempt_$i.lean && break
done
Before using this skill, set up a Lean 4 project on your machine:
lake new formal-verificationlakefile.leanlake build once to build the Mathlib cache (~500MB, one-time cost)verify.sh script in the project root (see below)#!/bin/bash
# verify.sh — compile and check a Lean 4 proof file
# Usage: bash verify.sh /path/to/proof.lean
set -e
export PATH="$HOME/.elan/bin:$PATH"
PROJECT_DIR="<your-lean-project-dir>"
cp "$1" "$PROJECT_DIR/Proof.lean"
cd "$PROJECT_DIR"
lake build
Use cases:
Do not use:
1. Write the code + specification in Lean 4
2. Call Leanstral API: "Prove this implementation satisfies the specification"
3. Compile the returned proof with verify.sh
4. Pass → correctness guaranteed. Fail → failure point is the bug location.
1. Write the specification in Lean 4 first (what "correct" means)
2. Call Leanstral to generate both implementation AND proof
3. Compile — when the proof passes, development is complete. No tests needed.
1. Observe the bug's symptoms
2. Write the "correct behavior" specification in Lean 4
3. Call Leanstral to prove the current implementation satisfies it
4. The point where the proof fails = the bug's root cause
5. Fix, then re-prove to confirm
1. Write the pre- and post-refactoring code
2. Define equivalence as a specification
3. Call Leanstral to prove they are equivalent
4. Proof passes → refactoring is mathematically safe
1. Model the state machine as an inductive proposition in Lean 4
2. Formulate properties: "no deadlocks", "no unreachable states", etc.
3. Call Leanstral to prove them
→ Catches concurrency bugs exhaustively, not probabilistically
Clearly state the properties to be proven in natural language. For example, when verifying a code change that adds a new boolean guard condition:
Convert the natural language properties into Lean 4 types and theorems:
-- Model the relevant boolean conditions as variables
variable (timedOut compactDuring timeExecuting userAborted : Bool)
-- Condition BEFORE the change
def triggerOld : Bool := timedOut && !compactDuring && !timeExecuting
-- Condition AFTER the change (adds userAborted guard)
def triggerNew : Bool := timedOut && !compactDuring && !timeExecuting && !userAborted
Send the model + theorems to the Leanstral API with a prompt like:
Prove the following theorems in Lean 4 using Mathlib.
Use only the tactics: simp, tauto, rw, cases, rfl, constructor.
[paste your model + theorems here]
# Save Leanstral's response to a .lean file
echo "$LEANSTRAL_OUTPUT" > FormalVerification.lean
# Compile and verify
bash <lean-project-dir>/verify.sh FormalVerification.lean
Build completed successfully → Proof completedWhen instructing Leanstral via the API, always provide:
[Context]
- Lean 4 version in use
- Dependencies (Mathlib, specific modules)
- The code/specification to verify
[Instruction]
- What property to prove
- Optional: suggested tactics or approach
- Any constraints (e.g., "only use simp and tauto")
-- Define the specification FIRST
def sorted_correct (arr : Array Nat) : Prop :=
∀ i j, i < j → j < arr.size → arr[i]! ≤ arr[j]!
-- Send to Leanstral: "Prove that bubbleSort satisfies sorted_correct"
def bubbleSort (arr : Array Nat) : Array Nat := by
sorry -- implementation
theorem bubbleSort_correct (arr : Array Nat) :
sorted_correct (bubbleSort arr) := by
sorry -- Leanstral generates this proof
If you have already configured a Leanstral sub-agent in your OpenClaw gateway (this is separate setup — see the gateway's agent configuration), you can delegate proof generation to it:
sessions_spawn:
agentId: <your-leanstral-agent-id>
task: |
You are a Lean 4 formal verification expert.
## Context
[Description of the verification target and properties to prove]
## Environment
Lean 4 is available on the HOST. To compile:
```bash
export PATH="$HOME/.elan/bin:$PATH"
bash <lean-project-dir>/verify.sh /workspace/your-file.lean
```
## Task
1. Identify the properties to be verified
2. Build a formal model in Lean 4
3. Write and prove theorems
4. Verify compilation using verify.sh (mandatory)
5. If it fails, read the error, fix the proof, and recompile
6. Report the final compilation output
Save to: /workspace/FormalVerification.lean
Recommended sub-agent parameters (if configuring one):
| Setting | Recommended |
|---|---|
| Model | labs-leanstral-2603 (Mistral) |
| Temperature | 1.0 |
| Max tokens | 32000 |
| Thinking / reasoning | high |
| Timeout | 600–1200 seconds |
| Fallback models | Any capable reasoning models |
Lean 4 with Mathlib requires ~500MB of cached dependencies. Run it directly on
your machine — containerizing it is impractical. The verify.sh script compiles
.lean files using the host's Lean installation.
The Leanstral model can take minutes to generate proofs. Countermeasures:
| Field | Example |
|---|---|
| Algorithms | Correctness of sorting/searching, proof of computational complexity |
| Security | Safety of authentication protocols, access control properties |
| Business logic | Consistency of fee calculations, exhaustiveness of discount rules |
| Data integrity | Satisfaction of DB constraints, safety of schema migrations |
| Protocols | Deadlock-free state transitions, message ordering guarantees |
| Mathematics | Correctness of statistical calculations, equivalence of formula transformations |
| State machines | No unreachable states, all transitions defined, invariant preservation |
| Refactoring | Pre/post equivalence proofs, behavioral preservation guarantees |
theoremlake build checks the proofs| Tactic | Use |
|---|---|
simp | Expand definitions and simplify |
tauto | Automatic proof of propositional logic |
rw [h] | Rewrite using assumption h |
cases | Case analysis |
intro h | Introduce implication |
rfl | Trivial equality |
constructor | Split a conjunction |
| Limitation | Detail |
|---|---|
| Lean 4 only | Does not support Coq, Isabelle, Agda, or other proof assistants |
| I/O-poor | File systems, networking, databases cannot be formalized practically |
| Learning curve | Requires understanding of Lean 4's tactic language |
| Not for full codebases | Formalizing an entire project is impractical — focus on critical core |
| Proof length | Complex proofs can be very long and may exceed context |
| Hallucinations | Rarely generates non-existent theorems/tactics — Lean catches these |
error: unknown identifier 'foo'
→ The definition does not exist. Define it with def or add an import.
error: type mismatch
→ Type mismatch. Check the variable type declaration.
error: tactic 'simp' failed
→ simp cannot prove it. Use more specific tactics (cases, rw).
# Basic retry loop: try up to 3 times, stop on first success
for i in 1 2 3; do
echo "Attempt $i..."
curl -s ... | python3 -c "import sys,json; print(json.load(sys.stdin)['choices'][0]['message']['content'])" > proof.lean
if bash verify.sh proof.lean 2>/dev/null; then
echo "✅ Proof verified on attempt $i"
break
fi
echo "❌ Attempt $i failed, retrying..."
done