AllToken — Slash Commands

Six slash-style AllToken commands — /alltoken-chat, /alltoken-image, /alltoken-video, /alltoken-search, /alltoken-models, /alltoken-cost — recognized in chat and run via stdlib Python recipes. Pair with alltoken for full project bootstrap.

Audits

Pass

Install

openclaw skills install alltoken-call

/alltoken-call — direct AllToken commands

This skill teaches the host agent to recognize six slash-style invocations in user prompts and run the corresponding recipe against the user's AllToken API key. Each command is a self-contained Python 3 stdlib script — no external SDK install required.

When to invoke this skill (vs alltoken)

  • "Generate an image of X" / "Translate this with Claude" / "Show available models" → this skill (alltoken-call)
  • "Build me an AllToken agent" / "Scaffold a chat app" / "Add AllToken to my project" → alltoken instead
  • Both → load both — they don't conflict

Prerequisites

  • ALLTOKEN_API_KEY exported in the environment the agent shells out to
  • python3 ≥ 3.10 available on PATH
  • Internet access from the agent's host

The agent should refuse to invoke any command in this skill if ALLTOKEN_API_KEY is unset, and prompt the user to set it first.
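That guard can be sketched as a tiny helper the recipes could share (the function name is ours, not part of the skill's API):

```python
import os

def require_api_key(env=None):
    """Return the AllToken API key, or fail loudly so the agent can
    prompt the user instead of making a doomed request."""
    env = os.environ if env is None else env
    key = env.get("ALLTOKEN_API_KEY")
    if not key:
        raise RuntimeError(
            "ALLTOKEN_API_KEY is not set; ask the user to export it before "
            "invoking any /alltoken-* command.")
    return key
```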

Trigger recognition

Match these patterns case-insensitively anywhere in the user's message (a leading / is canonical but not required):

  • /alltoken-chat, alltoken chat, "ask alltoken with model X" → Command 1
  • /alltoken-image, alltoken image, "generate an image via alltoken", "draw with alltoken" → Command 2
  • /alltoken-video, alltoken video, "make a video on alltoken" → Command 3
  • /alltoken-search, alltoken search, "use alltoken web search", "find with alltoken" → Command 4
  • /alltoken-models, alltoken models, "what alltoken models are available" → Command 5
  • /alltoken-cost, alltoken cost, "how much did that cost" (when the last response was an AllToken call) → Command 6
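A minimal matcher for the triggers above might look like this (the patterns and function name are illustrative; the context-dependent "how much did that cost" phrasing for Command 6 is deliberately left to the agent's own judgment):

```python
import re

# Illustrative trigger patterns; a real agent would also use conversation
# context (e.g. for "how much did that cost" after an AllToken call).
TRIGGERS = [
    (1, r"/?alltoken[- ]chat|ask alltoken"),
    (2, r"/?alltoken[- ]image|generate an image via alltoken|draw with alltoken"),
    (3, r"/?alltoken[- ]video|make a video on alltoken"),
    (4, r"/?alltoken[- ]search|alltoken web search|find with alltoken"),
    (5, r"/?alltoken[- ]models|what alltoken models"),
    (6, r"/?alltoken[- ]cost"),
]

def match_command(message: str):
    """Return the first matching command number, or None."""
    for cmd, pattern in TRIGGERS:
        if re.search(pattern, message, re.IGNORECASE):
            return cmd
    return None
```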

Production constraints baked into the recipes

These were verified live on 2026-05-12 — they're not optional:

  1. Image results are one-shot. Persist b64_json on the first completed read; re-polling returns 410 image_already_retrieved.
  2. enable_search: true only works on DeepSeek and Qwen. OpenAI returns 503, Claude/GLM/Kimi/Minimax silently drop it. The /alltoken-search recipe defaults to deepseek-v4-pro for this reason.
  3. Streaming usage requires opt-in. Add stream_options: {"include_usage": True} or it'll be null.
  4. The API key reaches /v1/* only. /api-account/user/balance, /usage, /billing return 401 with the Bearer token — they need a web session. Do not attempt them.
  5. Errors are envelope-shaped: {"error": {"code": "<slug>", "type": "<group>", "message": "...", "param": null, "request_id": "..."}}. Always surface code and request_id to the user on failure.
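The envelope rule in point 5 can be centralized in one formatter shared by every recipe; a sketch (the helper name is ours):

```python
import json

def format_error(status: int, body: bytes) -> str:
    """Render an AllToken error envelope the way point 5 requires:
    always surface code and request_id to the user."""
    try:
        err = json.loads(body).get("error", {})
    except ValueError:
        return f"[error {status}] non-JSON error body"
    return (f"[error {status}] {err.get('code')}/{err.get('type')}: "
            f"{err.get('message')}  req={err.get('request_id')}")
```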

1. /alltoken-chat

Syntax

/alltoken-chat <model> <prompt>
/alltoken-chat <prompt>                       # uses default: gpt-5.4-mini

Parameters

  • model (optional) — any ID from /v1/models. Common choices: gpt-5.4-mini, gpt-5.4, claude-sonnet-4-6, claude-opus-4-7, deepseek-v4-pro, gemini-3.1-pro-preview. Cheap defaults: gpt-5.4-nano, claude-haiku-4-5, gemini-3-flash-preview.
  • prompt (required) — free-form text.

Recipe — save as /tmp/at_chat.py, run with python3 /tmp/at_chat.py <model> <prompt...>:

import os, sys, json, urllib.request
args = sys.argv[1:]
if not args:
    print("usage: at_chat.py [model] <prompt...>"); sys.exit(2)
# Heuristic: first arg is model only if it has no spaces AND looks like an ID
first = args[0]
if len(args) >= 2 and " " not in first and (first.startswith(("gpt-","claude-","gemini-","deepseek-","glm-","qwen","kimi-","minimax-")) or ("-" in first and "." in first)):
    model, prompt = first, " ".join(args[1:])
else:
    model, prompt = "gpt-5.4-mini", " ".join(args)
body = json.dumps({
    "model": model,
    "messages": [{"role":"user","content": prompt}],
    "stream": True,
    "stream_options": {"include_usage": True},
}).encode()
req = urllib.request.Request("https://api.alltoken.ai/v1/chat/completions",
    data=body, method="POST",
    headers={"Authorization": f"Bearer {os.environ['ALLTOKEN_API_KEY']}", "Content-Type":"application/json"})
try:
    r = urllib.request.urlopen(req, timeout=120)
except urllib.error.HTTPError as e:
    err = json.loads(e.read()).get("error", {})
    print(f"\n[error {e.code}] {err.get('code')}/{err.get('type')}: {err.get('message')}  req={err.get('request_id')}")
    sys.exit(1)
usage = None
for raw in iter(r.readline, b""):
    line = raw.decode("utf-8","replace").rstrip("\n")
    if not line or line.startswith(":"): continue
    if line.startswith("data: "):
        data = line[6:]
        if data == "[DONE]": break
        obj = json.loads(data)
        if obj.get("usage"): usage = obj["usage"]
        for ch in obj.get("choices", []):
            c = ch.get("delta", {}).get("content")
            if c: sys.stdout.write(c); sys.stdout.flush()
print()
if usage:
    print(f"\n[usage] prompt={usage['prompt_tokens']} completion={usage['completion_tokens']} total={usage['total_tokens']} model={model}")

Agent presentation — after running, show the streamed text to the user and, in a separate line, surface the token usage (prompt + completion = total). If the user follows up with /alltoken-cost, that line is what gets multiplied by per-token prices.


2. /alltoken-image

Syntax

/alltoken-image <prompt> [--size=1024x1024] [--quality=low|medium|high] [--out=PATH]

Parameters

  • prompt (required)
  • --size — 1024x1024 (default), 1536x1024, 1024x1536, auto
  • --quality — low (default for speed), medium, high, auto
  • --out — output path (default: ./alltoken-image-<8charhex>.png in cwd)

Recipe — save as /tmp/at_image.py:

import os, sys, json, time, base64, uuid, argparse, urllib.request
ap = argparse.ArgumentParser()
ap.add_argument("prompt", nargs="+")
ap.add_argument("--size", default="1024x1024")
ap.add_argument("--quality", default="low", choices=["low","medium","high","auto"])
ap.add_argument("--out", default=None)
a = ap.parse_args()
prompt = " ".join(a.prompt)
out = a.out or f"alltoken-image-{uuid.uuid4().hex[:8]}.png"
H = {"Authorization": f"Bearer {os.environ['ALLTOKEN_API_KEY']}", "Content-Type":"application/json"}
body = json.dumps({"model":"gpt-image-2","prompt":prompt,"size":a.size,"quality":a.quality}).encode()
req = urllib.request.Request("https://api.alltoken.ai/v1/images/generations/async",
    data=body, method="POST", headers={**H, "Idempotency-Key": str(uuid.uuid4())})
try:
    created = json.loads(urllib.request.urlopen(req, timeout=60).read())
except urllib.error.HTTPError as e:
    print(f"[error] submit failed: {e.code} {e.read().decode()[:300]}"); sys.exit(1)
task_id = created["id"]; t0 = time.time()
print(f"submitted {task_id} (size={a.size} quality={a.quality})", flush=True)
while True:
    time.sleep(2)
    req = urllib.request.Request(f"https://api.alltoken.ai/v1/images/generations/{task_id}", headers=H)
    s = json.loads(urllib.request.urlopen(req, timeout=30).read())
    print(f"  [{time.time()-t0:.0f}s] {s['status']}", flush=True)
    if s["status"] == "completed":
        # ONE-SHOT: write immediately, never re-poll
        with open(out, "wb") as f: f.write(base64.b64decode(s["data"][0]["b64_json"]))
        print(f"saved {out} ({os.path.getsize(out)} bytes) in {time.time()-t0:.1f}s")
        u = s.get("usage", {}); print(f"[usage] input={u.get('input_tokens')} output={u.get('output_tokens')} total={u.get('total_tokens')}")
        break
    if s["status"] in ("failed","cancelled"):
        print(f"[error] task ended: {s.get('error')}"); sys.exit(1)

Agent presentation — confirm the file path, embed/preview the image if the host supports it, and warn the user that the result is gone from the server after retrieval. If the user asks for a variation, run a new /alltoken-image rather than re-polling the old task.
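To enforce the one-shot rule mechanically, the agent could keep a local record of retrieved task IDs so it never re-polls a completed task (the cache path and helper names are ours, not part of the API):

```python
import json
import os

def mark_retrieved(task_id: str, cache_path: str) -> None:
    """Record that this image task's b64_json was already persisted."""
    seen = set()
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            seen = set(json.load(f))
    seen.add(task_id)
    with open(cache_path, "w") as f:
        json.dump(sorted(seen), f)

def already_retrieved(task_id: str, cache_path: str) -> bool:
    """True if polling this task again would just return 410."""
    if not os.path.exists(cache_path):
        return False
    with open(cache_path) as f:
        return task_id in set(json.load(f))
```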


3. /alltoken-video

Syntax

/alltoken-video <prompt> [--model=seedance-1.5-pro] [--duration=5] [--ratio=16:9] [--resolution=480p|720p|1080p]

Parameters

  • prompt (required)
  • --model — seedance-1.5-pro (default), seedance-2.0, happyhorse-1.0-t2v, happyhorse-1.0-i2v. Check /v1/videos/models for the full list.
  • --duration — seconds, default 5
  • --ratio — 16:9 (default), 9:16, 4:3, 3:4, 21:9, 1:1, adaptive
  • --resolution — 480p (default), 720p, 1080p

Recipe — save as /tmp/at_video.py:

import os, sys, json, time, argparse, urllib.request
ap = argparse.ArgumentParser()
ap.add_argument("prompt", nargs="+")
ap.add_argument("--model", default="seedance-1.5-pro")
ap.add_argument("--duration", type=int, default=5)
ap.add_argument("--ratio", default="16:9")
ap.add_argument("--resolution", default="480p", choices=["480p","720p","1080p"])
a = ap.parse_args()
H = {"Authorization": f"Bearer {os.environ['ALLTOKEN_API_KEY']}", "Content-Type":"application/json"}
body = json.dumps({"model":a.model,"prompt":" ".join(a.prompt),"duration":a.duration,"ratio":a.ratio,"resolution":a.resolution}).encode()
req = urllib.request.Request("https://api.alltoken.ai/v1/videos/generations", data=body, method="POST", headers=H)
try:
    created = json.loads(urllib.request.urlopen(req, timeout=60).read())
except urllib.error.HTTPError as e:
    print(f"[error] {e.code} {e.read().decode()[:300]}"); sys.exit(1)
vid = created["id"]; t0 = time.time()
print(f"submitted {vid}", flush=True)
while True:
    time.sleep(3)
    req = urllib.request.Request(f"https://api.alltoken.ai/v1/videos/generations/{vid}", headers=H)
    s = json.loads(urllib.request.urlopen(req, timeout=30).read())
    print(f"  [{time.time()-t0:.0f}s] {s['status']}", flush=True)
    if s["status"] == "completed":
        url = s.get("video_url")
        ttl = s.get("video_url_ttl", "?")
        print(f"video_url ({ttl}s TTL): {url}")
        print(f"resolution={s.get('resolution')} ratio={s.get('ratio')} fps={s.get('fps')}")
        break
    if s["status"] in ("failed","cancelled","expired"):
        print(f"[error] task ended: {s.get('error')}"); sys.exit(1)

Agent presentation — give the video_url (presigned, expires in video_url_ttl seconds) and remind the user to download promptly. If they want to cancel mid-generation, POST /v1/videos/generations/{id}/cancel.
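A sketch of the cancel call (request construction only; the endpoint is the one named above, and we assume no request body is needed — verify against current docs):

```python
import urllib.request

API = "https://api.alltoken.ai"

def build_cancel_request(task_id: str, api_key: str) -> urllib.request.Request:
    """Build the POST /v1/videos/generations/{id}/cancel request."""
    return urllib.request.Request(
        f"{API}/v1/videos/generations/{task_id}/cancel",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"})
```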


4. /alltoken-search

Syntax

/alltoken-search <query>
/alltoken-search --model=qwen3.6-flash <query>

Parameters

  • query (required) — natural-language question
  • --model — defaults to deepseek-v4-pro. Only DeepSeek and Qwen models honor enable_search: true; choose one of: deepseek-v4-pro, deepseek-v3.2, qwen3.6-flash, qwen3.6-max-preview, qwen3.6-plus, qwen3.6-27b (the recipe's SEARCH_OK set). If you pass any other model, the recipe refuses and tells the user why.

Recipe — save as /tmp/at_search.py:

import os, sys, json, argparse, urllib.request
SEARCH_OK = {"deepseek-v4-pro","deepseek-v3.2","qwen3.6-flash","qwen3.6-max-preview","qwen3.6-plus","qwen3.6-27b"}
ap = argparse.ArgumentParser()
ap.add_argument("query", nargs="+")
ap.add_argument("--model", default="deepseek-v4-pro")
a = ap.parse_args()
if a.model not in SEARCH_OK:
    print(f"[refuse] {a.model} does NOT honor enable_search on AllToken today.")
    print(f"        Use one of: {', '.join(sorted(SEARCH_OK))}")
    sys.exit(2)
body = json.dumps({
    "model": a.model,
    "messages": [{"role":"user","content":" ".join(a.query)}],
    "enable_search": True,
    "max_tokens": 600,
}).encode()
req = urllib.request.Request("https://api.alltoken.ai/v1/chat/completions",
    data=body, method="POST",
    headers={"Authorization": f"Bearer {os.environ['ALLTOKEN_API_KEY']}", "Content-Type":"application/json"})
try:
    r = urllib.request.urlopen(req, timeout=120)
except urllib.error.HTTPError as e:
    err = json.loads(e.read()).get("error", {})
    print(f"[error {e.code}] {err.get('code')}/{err.get('type')}: {err.get('message')}  req={err.get('request_id')}"); sys.exit(1)
j = json.loads(r.read())
m = j["choices"][0]["message"]
print(m.get("content","").strip())
u = j.get("usage", {})
print(f"\n[usage] prompt={u.get('prompt_tokens')} completion={u.get('completion_tokens')} total={u.get('total_tokens')} model={a.model}")

Agent presentation — the response should read like a search-grounded answer (current dates, specific numbers). If the model still claims "I don't have web search", the model family accepts the flag but the underlying provider's search failed — re-run with a different model from the allowed set.


5. /alltoken-models

Syntax

/alltoken-models
/alltoken-models --type=chat        # chat (default), image, video
/alltoken-models --filter=claude    # substring filter on ID

Recipe — save as /tmp/at_models.py:

import os, sys, json, argparse, urllib.request
ap = argparse.ArgumentParser()
ap.add_argument("--type", default="chat", choices=["chat","image","video"])
ap.add_argument("--filter", default="")
a = ap.parse_args()
path = {"chat":"/v1/models","image":"/v1/images/models","video":"/v1/videos/models"}[a.type]
req = urllib.request.Request(f"https://api.alltoken.ai{path}",
    headers={"Authorization": f"Bearer {os.environ['ALLTOKEN_API_KEY']}"})
j = json.loads(urllib.request.urlopen(req, timeout=30).read())
ids = [m["id"] for m in j["data"]]
if a.filter: ids = [i for i in ids if a.filter.lower() in i.lower()]
print(f"{a.type}: {len(ids)} model(s)" + (f" matching '{a.filter}'" if a.filter else ""))
for i in ids: print(f"  {i}")

Agent presentation — list IDs in a code block. If the user asks "which is cheapest / fastest / best for code", cross-reference the verified-working table in alltoken/SKILL.md ## Discovering models.


6. /alltoken-cost

Compute the cost of a chat call from its usage block + per-million prices from the catalog. Useful right after /alltoken-chat or /alltoken-search.
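The arithmetic the recipe performs, as a standalone function with a worked example (the prices here are illustrative, not real AllToken rates):

```python
def chat_cost(prompt_tokens: int, completion_tokens: int,
              input_per_million: float, output_per_million: float) -> float:
    """Per-million pricing convention: tokens / 1M × price per 1M."""
    return (prompt_tokens / 1_000_000) * input_per_million \
         + (completion_tokens / 1_000_000) * output_per_million

# e.g. 1,200 prompt + 350 completion tokens at $0.40 in / $1.60 out per 1M:
# 0.00048 + 0.00056 = $0.00104
```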

Syntax

/alltoken-cost <model> <prompt_tokens> <completion_tokens>

Recipe — save as /tmp/at_cost.py:

import os, sys, json, urllib.request
if len(sys.argv) != 4:
    print("usage: at_cost.py <model> <prompt_tokens> <completion_tokens>"); sys.exit(2)
model, pt, ct = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
# Public catalog — no auth required
req = urllib.request.Request(f"https://api.alltoken.ai/api-account/models/{model}")
try:
    j = json.loads(urllib.request.urlopen(req, timeout=30).read())
except urllib.error.HTTPError as e:
    print(f"[error] catalog lookup failed: {e.code}"); sys.exit(1)
# Pricing fields vary; try common keys
data = j.get("data", j)
p_in  = float(data.get("input_price")  or data.get("prompt_price")     or 0)
p_out = float(data.get("output_price") or data.get("completion_price") or 0)
# Convention: prices are per 1M tokens
cost = (pt / 1_000_000) * p_in + (ct / 1_000_000) * p_out
print(f"model:            {model}")
print(f"prompt tokens:    {pt:>10,}")
print(f"completion tokens:{ct:>10,}")
print(f"input  $/1M:      ${p_in:.4f}")
print(f"output $/1M:      ${p_out:.4f}")
print(f"───")
print(f"total cost:       ${cost:.6f}")

Agent presentation — surface the total in $ to 6 decimal places; for streaming chat, prefer reading the exact figure from the after-[DONE] SSE comment line (which is the authoritative per-request cost from AllToken's gateway). Use this recipe as a fallback when that comment line wasn't captured.

Tip: pricing fields in the catalog response have evolved — if input_price/output_price aren't populated, fall back to inspecting data keys: curl https://api.alltoken.ai/api-account/models/<model> | jq 'keys'.


Error handling (shared across all commands)

When any recipe hits a non-2xx response, the AllToken envelope is {"error": {"code", "type", "message", "param", "request_id"}}. The agent should:

  1. Show the user code and message.
  2. Include request_id in any support communication.
  3. For specific slugs, take action without prompting:
    • invalid_api_key (401): tell the user the key is bad; do not retry.
    • image_already_retrieved (410): tell the user to re-run; the result is gone.
    • all_providers_failed (503): try a different model from the same family, then a different family.
    • rate_limited / HTTP 429: read Retry-After (integer seconds), sleep, retry once.
    • insufficient_balance (402): tell the user to top up in Settings → Billing.
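The 429 rule above can be sketched as a single-retry wrapper around urllib (the function names are ours):

```python
import time
import urllib.error
import urllib.request

def retry_after_seconds(headers, default=1) -> int:
    """Parse an integer Retry-After header, falling back to a default."""
    try:
        return int(headers.get("Retry-After", default))
    except (TypeError, ValueError):
        return default

def open_with_single_retry(req, timeout=60):
    """Open a request; on HTTP 429 sleep for Retry-After and retry once."""
    try:
        return urllib.request.urlopen(req, timeout=timeout)
    except urllib.error.HTTPError as e:
        if e.code != 429:
            raise
        time.sleep(retry_after_seconds(e.headers))
        return urllib.request.urlopen(req, timeout=timeout)  # single retry
```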

Companion skill

If the user wants to build something instead of one-shot calls, hand off to alltoken:

"If you want me to scaffold a whole agent project around this, load skills/alltoken/SKILL.md."

Resources