Install
openclaw skills install sharkEnables non-blocking AI agent execution by spawning parallel remora subagents for slow tasks, keeping the main agent responsive and efficient.
openclaw skills install sharkA shark that stops swimming dies. An agent that waits for tools wastes compute.
Works with: Claude Code · Codex · Gemini CLI · Cursor · Windsurf · Aider · OpenClaw · any LLM agent
Trigger this skill when the user says:
Every LLM turn must complete in under 30 seconds.
If any operation would take longer:
sessions_spawn with mode: "run")You are never in I/O wait. You are always reasoning about something.
┌─────────────┐
│ DECOMPOSE │ Break task into N independent subtasks
└──────┬──────┘
│ spawn N remoras (+ 1 pilot fish when first completes early)
▼
┌─────────────┐
│ SPAWN │ sessions_spawn × N, all parallel, record session IDs
└──────┬──────┘
│ main agent keeps reasoning (never waits)
▼
┌─────────────┐ timeout/crash
│ MONITOR │ ──────────────────► MARK ⏱/❌ (partial still useful)
└──────┬──────┘
│ all done OR deadline hit
▼
┌─────────────┐
│ AGGREGATE │ Collect results, note failures, merge pilot fish draft
└──────┬──────┘
│
▼
┌─────────────┐
│ REPORT │ Single coherent response with failure count noted
└─────────────┘
No nested remoras. If a remora is running, it executes inline — remoras cannot spawn their own remoras. Only the main shark spawns.
think → call slow tool → WAIT 60s → think → call slow tool → WAIT 45s → ...
think → spawn remora(slow tool) → think about something else
→ spawn remora(another tool) → synthesize partial results
→ receive remora result → incorporate → swim on
When applying the Shark Pattern, structure your work like this:
Before calling any tool, ask: "Will this take more than 20-30 seconds?"
Slow tools (always spawn):
Fast tools (run inline, never spawn):
sessions_spawn({
task: "Do the slow thing and return the result",
mode: "run",
runtime: "subagent",
streamTo: "parent" // optional: stream output back
})
Spawn multiple remoras in parallel when possible — don't serialize unless there's a data dependency.
After spawning, immediately continue:
When remora results arrive, weave them in and continue. Never re-do work a remora already completed.
If your runtime keeps subagents alive after completion, close them once you've incorporated their result. In Codex that means: wait for the remora, use its output, then close_agent(id) unless you intentionally plan to reuse that same agent.
| Operation | Budget | Action |
|---|---|---|
| File read | < 2s | Inline |
| Web search | 5-30s | Spawn |
| SSH command | 10-120s | Spawn |
| Build/test | 30-300s | Spawn |
| Coding agent | 60-600s | Spawn |
| Memory search | < 3s | Inline |
Without Shark (blocking):
1. Search web for X [wait 15s]
2. Search web for Y [wait 12s]
3. Fetch page Z [wait 8s]
4. SSH check server [wait 30s]
Total: ~65 seconds blocked
With Shark (non-blocking):
1. Spawn: search X [0s - spawned]
2. Spawn: search Y [0s - spawned]
3. Spawn: fetch Z [0s - spawned]
4. Spawn: SSH check [0s - spawned]
5. Plan synthesis while waiting [15s of actual thinking]
6. All results arrive → synthesize
Total: ~15s of thinking + max(tool times) in parallel
🦈 Shark mode — spawning [N] remoras for [tasks], continuing...
Use this format after each remora or pilot fish completes. Works in Telegram, Discord, Signal, iMessage — anywhere.
🦈 3 remoras · 1 pilot fish
◉ [A] task name here ████████████ ✅ 9s
◉ [B] task name here ████████████ ✅ 33s
○ [C] task name here ░░░░░░░░░░░░ pending
◈ [P] Pilot fish ██████░░░░░░ ~14s left
↳ continuing...
Symbols:
◉ = remora (completed)○ = remora (pending)⊙ = remora (running)◈ = pilot fish (time-bounded)████████████ = done bar (12 blocks)██████░░░░░░ = partial (filled = elapsed / total budget)░░░░░░░░░░░░ = not startedProgress fill: filled = round(elapsed / timeout * 12) blocks of █, remainder ░
Only post an update when something changes (remora completes or pilot fish starts/ends). Don't spam — one update per event.
After all remoras done:
🦈 All fins in — synthesising [N] results + pilot draft
Then deliver the report.
Pilot fish swim alongside sharks doing prep work. When you have idle time, use it.
When one remora returns early and others are still running:
remora A ──────► result (early)
remora B ────────────────────────────► result
remora C ──────────────────────────────────► result
main: spawn A, B, C
A done → spawn pilot-fish(A's result, timeout=est_remaining)
pilot-fish: pre-analyse A, draft partial report, validate data...
B done → pilot-fish still running, feed B's result in (or kill+reuse)
C done → kill pilot-fish, synthesise A+B+C+pilot-fish draft
runTimeoutSeconds equal to estimated remaining wait// remoras A (fast) and B (slow) both spawned
// A finishes in 10s, B will take another 30s
// Spawn pilot fish with 25s budget:
sessions_spawn({
task: "Pre-analyse these results from remora A.
Validate the data, note any gaps, draft the structure
of the final report. Stop after 25 seconds.",
runTimeoutSeconds: 25,
mode: "run"
})
// Main agent continues doing other work
// When B finishes → kill pilot fish → synthesise A + B + pilot draft
Before every tool call, ask: "Will this take more than 10 seconds?"
Estimated time < 10s? → run inline
Estimated time ≥ 10s? → spawn remora
Unknown latency? → spawn remora (assume slow)
Data dependency on another remora? → wait, then inline
Already at 8 remoras? → queue, don't stack
Always spawn: web search/fetch, SSH, build/test, coding agents, CI triggers, API calls with unknown latency Always inline: file read, memory lookup, string ops, math, local config reads
remoras will fail, timeout, or return garbage. Plan for it.
◉ [A] task ████████████ ⏱ 30s [timeout]
◉ [A] task ████████████ ❌ [error: connection refused]
close_agent(id) once their output is delivered.⏱ = timeout with partial, ❌ = hard error with nothingsessions_spawn call with mode: "run", runtime: "subagent", and runTimeoutSeconds set. A remora is specifically a timed sub-agent — untimed subagents are not remoras.runTimeoutSeconds — confirmed realVerified against OpenClaw source: runTimeoutSeconds: z.number().int().min(0).optional() — maps to the subagent wait timeout. Use it. Hard-kills the sub-agent process after N seconds, partial output returned.
pilotFishTimeout = min(estimatedRemaining * 0.8, 25)
estimatedRemaining = how long you think the slowest remaining remora will takeExample: slowest remaining remora estimated at 30s → pilot fish timeout = min(24, 25) = 24s
yieldMs > 30000 in exec calls — this holds the main turn hostageprocess(action=poll, timeout > 20000) in the main session — same reasonsleep or wait loops in the main threadrunTimeoutSeconds on remoras — unbound sub-agents are not sharksThe 30s cap isn't just a guideline — here's how to actually enforce it per runtime.
sessions_spawn({
task: "...",
mode: "run",
runtime: "subagent",
runTimeoutSeconds: 30 // hard kill after 30s — agent gets SIGTERM
})
runTimeoutSeconds is enforced by the OpenClaw runtime — the sub-agent process is killed if it exceeds it. Partial output is still returned.
exec({
command: "some-slow-command",
timeout: 30, // hard kill in seconds
background: true, // don't block the main agent turn
yieldMs: 500 // poll back quickly to check
})
timeout kills the process. background: true means the main agent doesn't wait — it gets a session handle and can check back with process(poll).
timeout 30 gemini -p "task here"
# or on Windows:
Start-Process gemini -ArgumentList '-p "task"' -Wait -Timeout 30
Wrap the CLI invocation with OS-level timeout / Start-Process -Timeout.
runTimeoutSecondssessions_spawn({
task: "pre-analyse partial results, draft structure, flag gaps",
mode: "run",
runTimeoutSeconds: estimatedRemainingMs / 1000, // die before the last remora
})
Set it to slightly less than your estimated remaining wait — so the pilot fish always finishes before you need to synthesise.
[timeout] in the progress bar instead of ✅⊙ [A] slow task ████████████ ⏱ 30s [timeout — partial result]
You can't hard-kill an LLM mid-turn, but you can:
thinking: "none" for fast sub-tasks that don't need deep reasoningRule of thumb: if a task description is >3 sentences, it probably needs to be split into remoras.
The Shark Pattern is runtime-agnostic. remoras can be any agent type.
sessions_spawn({
task: "...",
mode: "run",
runtime: "subagent",
runTimeoutSeconds: 30 // hard cap for pilot fish
})
sessions_spawn({
task: "...",
runtime: "acp",
agentId: "codex",
mode: "run",
runTimeoutSeconds: 30
})
Codex-specific lifecycle:
spawn_agent(...) or the runtime-equivalent remora launcherwait_agent(...)send_input(...)close_agent(id) so the agent does not linger in the sessionGemini CLI is a local process — spawn via exec with a timeout:
exec({
command: "gemini -p \"task description here\"",
timeout: 30, // hard cap in seconds
background: true, // don't block main agent
yieldMs: 500 // check back quickly
})
For Gemini sub-tasks, use exec with timeout + background: true rather than sessions_spawn. Treat the process handle the same way — continue working, collect output when it lands.
You can mix runtimes in the same shark run:
spawn remora A → Codex (coding task)
spawn remora B → Gemini (web search / analysis)
spawn remora C → Claude subagent (reasoning)
spawn pilot fish → Claude subagent (pre-analysis, time-bounded)
| Task type | Best runtime |
|---|---|
| Code generation / editing | Codex |
| Web search / summarise | Gemini CLI |
| Multi-step reasoning | Claude subagent |
| File ops / SSH / shell | exec (background) |
| Pre-analysis / drafting | Claude subagent (pilot fish) |
For slow shell commands (>5s), use the shark-exec companion skill:
shark-exec/SKILL.md in this repoexec call in background + cron pollerThe 30-second rule is best enforced at the shell level, not inside a turn.
Use shark.sh (or shark.ps1 on Windows) to run Claude in a bounded loop:
./shark.sh "find the latest ChatterPC version, check pve3, summarise GitHub issues"
Each iteration:
claude --print with a hard timeout 25s shell wrapper.shark-done → loop exitsThis is identical to the Ralph Loop pattern, but with the Shark Pattern as the prompt — Claude spawns remoras for slow work, keeps each turn under 25s, and the shell loop enforces the hard cut.
| Use case | Approach |
|---|---|
| Single fast task (<30s total) | claude --print "..." directly |
| Multi-step task, slow tools | ./shark.sh "..." loop |
| CI/build watching | shark-exec (background + cron) |
| Interactive chat | OpenClaw main session |
| Variable | Default | Description |
|---|---|---|
SHARK_MAX_LOOPS | 50 | Maximum iterations before giving up |
SHARK_LOOP_TIMEOUT | 25 | Per-turn timeout in seconds (hard kill) |
When Claude determines the task is done, it writes to .shark-done:
TASK_COMPLETE
<brief summary of what was accomplished>
The loop detects this file and exits cleanly.
When the user invokes these commands, follow the instructions for each.
/shark <task>Apply the Shark Pattern to the given task. Decompose, spawn remoras for slow ops, keep the main fin moving. Follow all rules in this SKILL.md.
/shark-loop <task> [--max-loops N] [--timeout S]Run the external shark loop enforcer. Execute:
$env:SHARK_MAX_LOOPS = "<N>"
$env:SHARK_LOOP_TIMEOUT = "<S>"
powershell.exe -ExecutionPolicy Bypass -File "<skill_dir>/shark.ps1" "<task>"
Defaults: --max-loops 50, --timeout 25. On Linux/Mac use shark.sh instead.
/shark-statusCheck current shark state:
<skill_dir>/shark-exec/state/pending.json — report active background jobs (label, command, elapsed time, whether overdue past maxSeconds).shark-done exists, show its contentsSHARK_LOG.md exists, show the last 10 lines/shark-cleanRemove shark state files: .shark-done, SHARK_LOG.md, shark-exec/state/pending.json. Report what was cleaned.
/shark-autotuneAnalyse timing history and recommend optimal settings.
Read <skill_dir>/state/timings.jsonl — each line is:
{"ts":1710000000,"loop":1,"elapsed_s":12.3,"timeout_s":25,"result":"ok|timeout|done","task_hash":"abc123"}
If no data, report "No timing data yet. Run tasks with /shark first."
Compute and report:
Show recommendations:
Current: SHARK_LOOP_TIMEOUT=25 SHARK_MAX_LOOPS=50
Recommended: SHARK_LOOP_TIMEOUT=N SHARK_MAX_LOOPS=M
Rationale:
- p95 turn time is Xs, so timeout of Ns covers 95% with buffer
- p95 completion is N loops, so max_loops of M gives safe margin
- Timeout rate is X% — [>15%: consider splitting tasks | healthy]
- Wasted headroom: Xs total
If timeout rate > 30%: "Consider breaking tasks into smaller steps."
If median turn time < 5s: "Most turns complete fast. Consider lowering timeout."
Both shark.sh and shark.ps1 automatically record per-loop timings to state/timings.jsonl. Each entry includes:
ts — Unix timestamploop — loop iteration numberelapsed_s — actual wall-clock seconds for this turntimeout_s — configured timeout for this runresult — "ok" (completed), "timeout" (hit limit), "done" (task finished)task_hash — 8-char hash correlating loops within a single runUse /shark-autotune to analyse this data and tune your settings.
mode: "run", runtime: "subagent"npm install -g @google/gemini-cli