Spraay Compute & Futures

Rent GPU compute, run AI model inference (LLM chat, image, video, speech-to-text, text-to-speech, embeddings), and buy prepaid compute-futures credits — all paid in USDC over x402, no API keys and no accounts. Use this skill whenever an agent needs to rent GPU time, run or batch model inference, generate images/video/audio, create text embeddings for RAG, estimate compute cost before spending, or lock in discounted compute by prepaying into a compute-futures credit balance. Trigger it for any phrasing like "rent GPU", "run inference", "generate an image/video", "transcribe audio", "text to speech", "get embeddings", "prepay compute", "compute credits", or "compute futures", even when x402 or Spraay is not named.

Install

openclaw skills install spraay-compute

Spraay Compute & Futures 💧

Two capabilities, given equal weight:

  1. Compute rental — pay-per-call GPU and model inference (LLM, image, video, TTS, STT, embeddings) via the Spraay x402 gateway. One HTTP request, one USDC payment, the result comes back. No keys, no signup.
  2. Compute futures — prepay USDC into a credit balance and draw it down per job at a tier discount (up to 15%). Settle once, run many jobs with no per-call payment, refund whatever is left.

Everything settles in USDC over x402 V2 on Base mainnet and Solana mainnet. The gateway returns a standard HTTP 402 Payment Required response carrying the payment requirements; the agent pays via its x402 client and retries. USDC contract on Base: 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913. USDC mint on Solana: EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v.

Base URL: https://gateway.spraay.app

When to use which

  • One-off or low-volume job → call the relevant compute endpoint directly and pay per call.
  • Repeated jobs / known budget / cost-sensitive agent → open a compute-futures account once, then execute against the balance. Jobs are charged to the balance at the tier discount, and the only x402 payment per job is the $0.001 settlement fee rather than the full per-call gate price.
  • Unsure of cost → hit POST /api/v1/compute/estimate (free) first, or GET /api/v1/compute-futures/pricing ($0.001) for tier and per-model costs.
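The choice between the two paths comes down to simple arithmetic, sketched below. This is a rough illustration, not the gateway's billing logic: it assumes the amount deducted from a futures balance is the per-call gate price reduced by the tier discount (consistent with the $0.027 charge shown for a $0.03 call at 10% later in this document), and the function names are hypothetical.

```python
from decimal import Decimal

SETTLEMENT_FEE = Decimal("0.001")  # x402 gate price of compute-futures/execute
DEPOSIT_FEE = Decimal("0.01")      # x402 gate price of compute-futures/deposit


def per_call_total(gate_price: Decimal, jobs: int) -> Decimal:
    """Total spend when every job is paid individually at the 402."""
    return gate_price * jobs


def futures_total(gate_price: Decimal, jobs: int, discount: Decimal) -> Decimal:
    """Total spend when jobs draw down a prepaid balance.

    ASSUMPTION: balance deduction per job = gate_price * (1 - discount).
    The $0.01 refund fee is not included; add it if leftover credit is reclaimed.
    """
    per_job = gate_price * (1 - discount) + SETTLEMENT_FEE
    return DEPOSIT_FEE + per_job * jobs


if __name__ == "__main__":
    price = Decimal("0.03")  # text-inference gate price
    print(per_call_total(price, 100))
    print(futures_total(price, 100, Decimal("0.10")))
```

Under that assumption the $0.001 settlement fee can outweigh the discount on the cheapest endpoints (for example embeddings at $0.005), so it may be worth checking compute-futures/pricing before prepaying for those.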

The x402 flow (how every paid call works)

  1. Agent sends the request (e.g. POST /api/v1/compute/text-inference).
  2. Gateway responds 402 Payment Required with accepts (price, network, payTo).
  3. Agent's x402 client signs an EIP-3009 (Base) or SPL (Solana) USDC authorization for the quoted amount.
  4. Agent retries with the X-PAYMENT header (Base) / X-Solana-Tx header (Solana).
  5. Gateway verifies/settles via the facilitator and returns the result.

Any x402-aware client handles this automatically (@x402/fetch, x402-axios, the Spraay MCP server, or an OpenClaw payment skill such as ClawPay/Vault-0). The agent only needs a funded wallet.
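For agents without an x402-aware client, the five steps can be sketched as a small retry wrapper. This is a minimal illustration under stated assumptions: `send` and `sign` are hypothetical callables supplied by the caller (an HTTP transport and a wallet signer), the `network` values "base" and "solana" are assumed identifiers, and the real payload encoding is defined by the x402 client, not by this code.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]  # (HTTP status, parsed JSON body); hypothetical shape
Send = Callable[[str, str, Dict[str, str], dict], Response]  # method, url, headers, body
Sign = Callable[[dict], str]  # payment requirement -> encoded payment payload

# Header names as listed in step 4; the network keys are assumed identifiers.
PAYMENT_HEADERS = {"base": "X-PAYMENT", "solana": "X-Solana-Tx"}


def call_with_payment(send: Send, sign: Sign, method: str, url: str,
                      body: dict, network: str = "base") -> Response:
    """Send a request, and if the gateway answers 402, pay and retry once."""
    status, payload = send(method, url, {}, body)  # step 1
    if status != 402:
        return status, payload  # free endpoint or already settled

    # Step 2: pick the payment requirement for the chosen network.
    requirement = next(
        (a for a in payload.get("accepts", []) if a.get("network") == network),
        None,
    )
    if requirement is None:
        raise ValueError(f"gateway offered no payment option for network {network!r}")

    # Steps 3-4: sign the authorization and retry with the payment header.
    header = PAYMENT_HEADERS[network]
    return send(method, url, {header: sign(requirement)}, body)  # step 5 happens server-side
```

Injecting `send` and `sign` keeps the wrapper independent of any particular HTTP library or wallet, which also makes it easy to exercise against a fake gateway before spending real USDC.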

Quick endpoint reference

Prices below are the x402 gate price per call (what the agent pays at the 402). Free endpoints need no payment.

GPU / Compute rental

| Endpoint | Method | Price | Purpose |
|---|---|---|---|
| /api/v1/gpu/run | POST | $0.06 | Run any Replicate model (image, video, LLM, audio, utility) |
| /api/v1/gpu/status/:id | GET | $0.005 | Poll an async GPU prediction |
| /api/v1/gpu/models | GET | free | List GPU model shortcuts |

Model inference

| Endpoint | Method | Price | Purpose |
|---|---|---|---|
| /api/v1/compute/text-inference | POST | $0.03 | LLM chat/completion: 11 models, 3B–405B (Chutes AI / Bittensor SN64, OpenRouter) |
| /api/v1/compute/image-generation | POST | $0.03 | Text-to-image: FLUX Schnell/Dev/Pro, SDXL |
| /api/v1/compute/video-generation | POST | $0.50 | Text-to-video: MiniMax Video 01, Wan 2.1 (async) |
| /api/v1/compute/text-to-speech | POST | $0.03 | TTS / voice synthesis |
| /api/v1/compute/speech-to-text | POST | $0.02 | Whisper Large V3 transcription, 100+ languages |
| /api/v1/compute/embeddings | POST | $0.005 | Text/vector embeddings for RAG and semantic search |
| /api/v1/compute/batch | POST | $0.05 | Up to 50 mixed jobs in one payment, 10% batch discount |
| /api/v1/compute/status/:jobId | GET | $0.001 | Poll an async compute job (video, batch items) |
| /api/v1/compute/models | GET | free | List all compute models with pricing |
| /api/v1/compute/estimate | POST | free | Estimate cost before committing |
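Because the batch endpoint accepts at most 50 jobs per payment, a longer work queue has to be split before submission. The helper below is a small sketch of that split; the job dictionaries in the usage lines are hypothetical placeholders, since the actual batch request schema lives in references/endpoints.md.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_JOBS = 50  # limit stated for /api/v1/compute/batch


def chunk_jobs(jobs: Sequence[T], size: int = MAX_BATCH_JOBS) -> List[List[T]]:
    """Split a job queue into consecutive chunks of at most `size` jobs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(jobs[i:i + size]) for i in range(0, len(jobs), size)]


if __name__ == "__main__":
    # Hypothetical job shape, for illustration only.
    queue = [{"type": "embeddings", "input": f"doc {n}"} for n in range(120)]
    for batch in chunk_jobs(queue):
        print(len(batch))  # each chunk would be sent as one batch request
```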

Compute futures (prepaid credits)

| Endpoint | Method | Price | Purpose |
|---|---|---|---|
| /api/v1/compute-futures/deposit | POST | $0.01 | Open a prepaid credit account. Tiers: $10+ (5%), $50+ (10%), $200+ (15%) |
| /api/v1/compute-futures/balance | GET | $0.001 | Balance, tier, discount, usage stats |
| /api/v1/compute-futures/execute | POST | $0.001 | Run a job, deduct from balance (no per-call x402, discount applied) |
| /api/v1/compute-futures/history | GET | $0.002 | Full usage ledger |
| /api/v1/compute-futures/refund | POST | $0.01 | Refund unused balance to the depositor |
| /api/v1/compute-futures/pricing | GET | $0.001 | Tier discounts, per-model costs, bulk-discount info |
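The deposit tiers listed for compute-futures/deposit can be expressed as a small lookup, which is handy when an agent wants to decide how much to prepay. This is a sketch of the stated thresholds only; how the gateway treats deposits below $10 is not documented here, so returning a zero discount for them is an assumption.

```python
from decimal import Decimal

# (minimum deposit in USDC, discount) ordered from highest tier to lowest.
TIERS = [
    (Decimal("200"), Decimal("0.15")),
    (Decimal("50"), Decimal("0.10")),
    (Decimal("10"), Decimal("0.05")),
]


def tier_discount(deposit_usdc: Decimal) -> Decimal:
    """Return the discount fraction for a deposit amount."""
    for minimum, discount in TIERS:
        if deposit_usdc >= minimum:
            return discount
    return Decimal("0")  # ASSUMPTION: deposits under $10 earn no discount


if __name__ == "__main__":
    for amount in ("9.99", "10", "50", "199.99", "200"):
        print(amount, tier_discount(Decimal(amount)))
```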

For exact request/response schemas, required fields, and model lists, read references/endpoints.md. For runnable end-to-end examples (per-call and the full futures lifecycle), read examples/quickstart.md.

Headline workflows

Rent compute (per-call), e.g. LLM inference

POST /api/v1/compute/text-inference
{ "messages": [{ "role": "user", "content": "Summarize this contract: ..." }], "model": "auto" }
→ 402 → pay $0.03 USDC → retry → { provider, model, choices: [...], usage, price_usdc }

Run a GPU model on Replicate

POST /api/v1/gpu/run
{ "model": "flux-pro", "input": { "prompt": "a serene mountain lake at sunset" } }
→ 402 → pay $0.06 USDC → retry → { id, status, model, output: ["https://replicate.delivery/..."] }

Compute futures lifecycle (prepay → draw down → refund)

POST /api/v1/compute-futures/deposit   { "depositor": "0xYou", "amount": "50" }
  → pay $0.01 → { computeFuture: { id: "CFE-ABC12345", tier: "scale", discount: "10% discount", balanceRemaining: "50 USDC" } }
POST /api/v1/compute-futures/execute   { "futuresId": "CFE-ABC12345", "type": "text-inference", "messages": [...] }
  → pay $0.001 → { billing: { charged: "$0.027", balanceRemaining: "$42.473 USDC" }, compute: { model: "Llama 3.3 70B" } }
POST /api/v1/compute-futures/refund    { "futuresId": "CFE-ABC12345", "caller": "0xYou" }
  → pay $0.01 → { refund: { refundAmount: "42.50 USDC", jobsExecuted: 15 } }

Rules and gotchas

  • Async endpoints (video-generation, some batch items) return a prediction_id / poll_url. Poll /api/v1/compute/status/:jobId until status: "completed". Each poll is itself a paid call ($0.001), so poll at a sensible interval.
  • execute only deducts from a prepaid balance — it does not run a per-call x402 payment for the compute itself; you only pay the $0.001 settlement. Make sure the futures account has enough balance or the job is rejected.
  • Refunds are depositor-only. caller must equal the original depositor.
  • Use auto for model when you don't care which model serves the request; the gateway routes to a sensible default for that job type.
  • Free before paid. compute/estimate, compute/models, gpu/models, and /.well-known/x402.json cost nothing — use them to plan a call before spending.
  • Discovery: the gateway publishes a machine-readable catalog at https://gateway.spraay.app/.well-known/x402.json. Point a discovery-driven agent there to enumerate live endpoints and prices.
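The async rule above implies a polling loop. The sketch below shows one way to bound it, assuming a caller-supplied `fetch_status` function (hypothetical) that performs the paid status request and returns the parsed JSON. The "failed" status name is an assumption; only "completed" is documented here.

```python
import time
from typing import Callable


def poll_until_done(fetch_status: Callable[[str], dict], job_id: str,
                    interval_s: float = 2.0, max_attempts: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll a job until it reports status "completed", with a hard attempt cap.

    Each call to fetch_status is a paid request, so max_attempts also caps spend.
    """
    for attempt in range(max_attempts):
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":  # ASSUMPTION: name of a terminal failure status
            raise RuntimeError(f"job {job_id} failed: {job}")
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next paid poll
    raise TimeoutError(f"job {job_id} not completed after {max_attempts} polls")
```

Passing `sleep` as a parameter lets the loop be tested without real delays, and the attempt cap keeps a stuck job from draining the wallet one status call at a time.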

Provenance

This skill wraps the Spraay x402 Gateway compute surface (GPU/Compute, Compute Services, and Compute Futures / Category 22). Prices reflect the live gateway gate prices in USDC. If the gateway updates pricing or adds models, regenerate references/endpoints.md from /.well-known/x402.json.