Install
openclaw skills install @riffkit/riffkitRiff winning short videos — give one source (a TikTok link, an uploaded video, or an analyzed template) and the backend riffs its emotion formula into your own AI video (post-ready short-form or UGC-style ad creative), with optional digital character, product placement, and language. You riff the formula, not the video. Triggers: the user says 'riff this video', 'turn this TikTok into mine', 'make a video with this product', 'make an ad' / 'make an ad creative' / 'a UGC ad for my product', 'make a promo / marketing video for my app or product', 'remake a viral video', 'generate a short video', 'riff', 'riffkit', or sends a product image / viral link wanting a short video.
openclaw skills install @riffkit/riffkitCore stance: you riff the formula, not the video. Give one winning source; the backend analyzes the emotion formula that hijacks attention and migrates that formula onto your own content. The footage can be completely different as long as the viewer travels the same psychological path.
One screen, one action: source (required) → optional settings → submit. The real product is a single page and a single call (POST /api/riffs). Every setting other than the source has a sensible default — character defaults to Auto (no digital human; the AI generates the on-camera person) and product defaults to none. When the user doesn't care, the agent applies defaults silently instead of dragging them through a multi-step wizard.
The agent's highest-value contribution is content_anchor (the creative direction) — the one degree of strategic freedom: which of the product's N selling points to angle on, which surface to fill into the template's emotion mechanism. It is an optional collaboration, not a blocking hard-stop. See ## content_anchor drafting framework below.
This skill makes riff videos only: it analyzes a source video's emotion formula and regenerates it as your own AI video. That is the entire product surface. If a user asks for something outside this — a different content format, or a feature this product doesn't have — say plainly that this product only makes riff videos; don't call unrelated APIs and don't steer them elsewhere.
No staff/admin features are exposed. This skill covers only endpoints a normal authenticated user can call. Building platform templates by analyzing new sources, publishing/unpublishing platform templates, cross-scope task search, manually granting/clawing back credits — all staff-only. This document never lists them and the agent never calls them.
Output follows the user's input language: reply in English to English, in Simplified Chinese to Chinese; for mixed input, follow the dominant language of the current message. The agent's internal reasoning is exempt.
Always keep verbatim (do not translate): field IDs, API paths, template_type (only pipeline), status enums (queued/running/completed/failed/dead/cancelled), product_visibility values (on_camera/off_camera/no_product), parameter names, the vee_session token.
[Flow]
1. Pick the source (exactly one, required)
├── analyzed template formula_id → skips analysis, generates now (fastest)
├── TikTok link tiktok_url → backend downloads + analyzes + generates
└── uploaded video video (≤100MB, ≤ render cap) → backend analyzes + generates
↓
2. Optional settings (all defaulted; agent may suggest, never forces)
character default Auto (AI-generated person); may suggest a fitting character on account intent
product default none (no_product); attach an existing/new product to place one
visibility default on_camera; only meaningful when a product is attached
language default en; candidates from GET /api/languages (currently en / es)
content_anchor optional creative direction; agent may proactively draft one for review
user_hint optional hook hint; only used for a NEW source (ignored for a template)
↓
3. Confirm before submit (the only hard-stop) → POST /api/riffs
↓ ↳ insufficient balance returns HTTP 402 (structured); handle per "Billing & balance"
4. Monitor GET /api/tasks/batch/{batch_id} (every 10-15s)
↓
5. Deliver GET /api/assets → download links + caption + hashtags + strategy recap
The only hard-stop is that one pre-submit confirmation (the financial commitment). Character, product, and content_anchor are optional collaborations and never block the flow.
Endpoints at a glance:
| Endpoint | Purpose |
|---|---|
GET /api/auth/me | Check auth state |
POST /api/skill/device/authorize · POST /api/skill/device/token | One-click sign-in (device authorization; no token paste) |
POST /api/riffs | One-shot riff (the preferred, near-only generation entry) |
GET /api/formulas | List analyzed templates (one of the sources) |
GET /api/formulas/{id} | A template's extraction_summary (what was extracted) |
POST /api/formulas/{id}/refresh-analysis | Re-analyze a template whose analysis is stale |
GET /api/characters | Digital characters (optional binding) |
GET /api/products · POST /api/products · POST /api/products/{id}/images | Products + product images (optional placement) |
GET /api/languages | Video language candidates |
GET /api/tasks/batch/{id} · GET /api/tasks/{id} | Progress polling |
GET /api/tasks · GET /api/tasks/stats | List / count tasks |
POST /api/tasks/{id}/cancel · POST /api/tasks/{id}/retry | Cancel / retry |
GET /api/assets | Fetch finished videos (video + caption + hashtags) |
GET /api/usage/credits | Balance |
GET /api/billing/plans · GET /api/billing/subscription | Plan catalog / current plan (for post-402 upsell) |
Full params and responses in "API reference" below.
The agent never submits on its own. It stops once for explicit consent before submitting. Everything else may proceed on defaults.
Do NOT:
content_anchor as a hard-stop that must be iterated to the user's satisfaction before continuing (it's an optional collaboration)Do:
POST /api/riffstopup_url verbatim, no retry, no silent failureGET /api/usage/credits when the user actively asks "how much will this cost / how much do I have left"Until the user says "submit / generate / riff / go", you are a collaborator that drafts and presents a plan — not a command executor.
| Layer | Role | Locked by | Freedom |
|---|---|---|---|
| Formula + skeleton | Floor guarantee — a validated emotion mechanism + camera language | At template analysis | None (changing it forfeits the riff's value) |
| Character + product | Base constants — the digital human + product facts | Chosen in settings (or default) | Different picks = different constants, but constant within one task |
| content_anchor | Ceiling driver — which selling-point angle, which surface to fill | Agent + user draft it (optional) | The one degree of strategic freedom |
The formula skeleton decides which psychological path the viewer walks; content_anchor decides what specific content fills that path. The other layers are pre-existing constants, so the agent's differentiated value is fusing "source formula × product/account × character" into one concrete creative instruction.
| Source | Param | When |
|---|---|---|
| Analyzed template | formula_id | The user wants an existing template, or has riffed this source before — skips analysis, fastest/cheapest (analysis is free but still takes time) |
| TikTok link | tiktok_url | The user dropped a viral link; the server auto-downloads the video + extracts BGM |
| Uploaded video | video | The user has a local file (≤100MB, and within the render-duration cap — see General constraints; a longer source is rejected, not trimmed) |
GET /api/formulas?status=analyzed&template_type=pipeline. visibility=public are platform-curated templates (usable across scopes, prefer recommending them); a template with analysis_prompt_is_latest=false has stale analysis — suggest refresh-analysis before using it.Each can be left alone on its default; the agent may suggest where helpful but never blocks.
Character (default Auto)
character_ids is empty = Auto mode: no digital human bound, SD2 generates the on-camera person. This is the product default, not an edge case.GET /api/characters and match by persona feel + gender / age_range, then suggest one. Only suggest characters with has_any_active_avatar=true (a false character can't generate video yet; the user must approve its avatar in Settings first).Product (default none)
product_id is empty = no_product mode: pure content, the caption never mentions a product name or product CTA, the whole video just runs the template's emotion formula. Good for growth / relatability / educational content.GET /api/products, take the product_id.POST /api/products write until just before submit (don't leave a half-baked product in the DB before the plan is settled).POST /api/products/{id}/images). Every image must have a name — to put a specific image on camera, write that image's name directly in content_anchor text (see below); an unnamed image can't be referenced.Visibility product_visibility (only meaningful with a product; default on_camera)
| Value | Meaning | Best for |
|---|---|---|
on_camera (default) | Product appears as a physical object on screen (character holds / scans / shows it) | Food / cosmetics / small physical goods / packaging as the core hook |
off_camera | Product never enters frame; conveyed only via subtitles / voiceover / caption text | Apps / websites / SaaS / services / non-portable goods |
When product_id is empty this field is ignored and the backend derives no_product. The caller may not pass no_product directly (only the two literals on_camera / off_camera are accepted). The script and visual staging differ greatly across modes, so when a product is bound always state the value and the reason at the confirmation step.
Language (default en)
GET /api/languages (currently en / es — overseas-first; Mandarin output is not exposed for now). Trust the endpoint, don't hardcode.content_anchor (optional creative direction) + user_hint (optional hook hint)
content_anchor is the agent's highest-value contribution: it may proactively draft one for the user to review (see ## content_anchor drafting framework). If the user doesn't want one, leave it empty — the video still generates.user_hint feeds only a new source's analysis ("this popped off on the twist at 0:03"); it's ignored when a formula_id is chosen, so don't send it then.Restate the plan, no credits, no balance pre-check:
Ready to riff:
├── Source: [template name / TikTok link / uploaded filename]
├── Character: [name / Auto (AI-generated person)]
├── Product: [name + visibility / none]
├── Language: [en / es]
└── content_anchor: [drafted creative direction / none]
When the user says "submit / generate / riff" → call POST /api/riffs.
POST /api/products (+ upload images serially) to get the product_id, then include it in the riff.insufficient_credits) → handle per "Billing & balance".batch_id (the analyze task and the chained generation task both carry it) → poll GET /api/tasks/batch/{batch_id}.failed/dead, read error to locate the cause, don't auto-retry, tell the user and let them decide; if queued for over 2 minutes, note "server is at its concurrency cap (10), please wait."GET /api/assets?asset_role=final_reel&sort=created_desc&limit=10 (add formula_id / character to filter this run):
${BASE_URL}${file_url} (direct video link)caption (hook → body → closing call-to-action folded into one paragraph) + asset_hashtagsGET /api/tasks/{task_id}/content.The formula and skeleton decide which psychological path the viewer walks;
content_anchordecides what specific content fills that path. When non-empty it is the highest-priority input for surface direction. Failure test: if swapping the surface for any other topic still holds, the anchor never anchored the output → invalid.
Drafting template:
[a specific emotion-mechanism beat of the template] × [a specific feature of the product/account] → [the viewer mind-shift you want]
All three variables must be specific to an actionable level — anything abstract is as good as empty.
| ✅ Focus on | ❌ Don't (lives elsewhere or zero-info) |
|---|---|
| The specific product × template join ("the scan feature × the reveal beat at segment 2") | Product generalities ("show the product's strengths") |
| The angle you want this time (which of N selling points) | Template generalities ("use the funny formula") |
| One specific face of the audience's pain point | Account positioning ("health niche" — already in persona) |
| The viewer mind-shift ("from 'I assumed it was safe' to 'a quick scan reveals hidden additives'") | Generic creative words ("authentic / real / heartfelt") |
Where the anchor's weight goes per mode:
| Mode | content_anchor weight |
|---|---|
on_camera | Product visual feature × the template's on-screen action ("the package-scan gesture × the reveal beat's curiosity→surprise") |
off_camera | Product function/benefit × the template's voiceover/subtitle ("the pain the app solves × the hook's resonance → download urge") |
no_product | The account's specific angle × the template's emotion formula → the resonance you want (the anchor matters most here — with no product, it's the only thematic anchor) |
Place a product image on camera by name (on_camera only): write the product image's name directly in content_anchor text and the engine matches that name and places the image on screen. The image must be named (an unnamed image can't be referenced). Example: writing in content_anchor "use the ingredient-scan screen shot to reveal the hidden additives" puts the image named "ingredient-scan screen" into the matching shot. (This is plain name matching, not an @-syntax — the @-mention is only a web-UI textarea helper that inserts the name for you; agents write the name themselves.)
BASE_URL = https://riffkit.ai
Content-Type: application/json; charset=utf-8 (except multipart endpoints)
Auth: cookie-based session (vee_session)
Every path below already includes the full prefix — just append it to ${BASE_URL} (e.g. GET /api/auth/me → https://riffkit.ai/api/auth/me).
⚠️ Request bodies must be UTF-8. Python requests.post(url, json=...), Node fetch/axios, Go json.Marshal are UTF-8 by default — pure-ASCII needs nothing. Only on Chinese Windows cmd run chcp 65001 first (PowerShell also needs [Console]::OutputEncoding = [System.Text.Encoding]::UTF8), or non-ASCII characters get sent as GBK and rejected with BAD_REQUEST. Never assemble a byte string with data= in any language.
The API uses a cookie-based session (vee_session). Never ask for a password in chat. The agent obtains a session through a one-click device-authorization flow — the user just opens a link and clicks Approve, and the session flows back automatically. No token is ever pasted into chat. (Same UX as gh auth login.)
GET /api/auth/me → 200 logged in / 401 not.POST /api/skill/device/authorize (no body, no auth needed) → {device_code, user_code, verification_uri, verification_uri_complete, expires_in, interval}.Open this and click Approve — I'll connect automatically:
<verification_uri_complete>
(confirm the page shows this code before approving: <user_code>)
POST /api/skill/device/token with {"device_code": "<device_code>"} every interval seconds (default 5s):
{"status":"authorization_pending"} → keep polling{"status":"approved","token":"<t>"} → done; use Cookie: vee_session=<t> on every later request{"status":"expired"|"denied"|"invalid"|"consumed"} → stop and start over with a fresh authorizeexpires_in (10 min) and tell the user the link expiredCookie: vee_session=<value> to every subsequent request.The device flow is the only sign-in path — the token never gets pasted into chat. (Settings → AI Agent mode shows the same one-click steps.)
GET /api/auth/me200 → UserOut / 401 → unauthenticated.
| Field | Type | Notes |
|---|---|---|
id | string | User ID |
email | string | Email (= identity; no separate name) |
role | string | Scope role: owner / admin / member |
is_active | boolean | Active |
is_staff | boolean | Product-level staff (default false) |
scope_id | string? | Owning scope |
daily_credits_limit | float | Daily credit cap (0 = unlimited) |
created_at / last_login_at | datetime | Created / last login |
POST /api/skill/device/authorize — start one-click sign-inNo body, no auth. Response:
| Field | Type | Notes |
|---|---|---|
device_code | string | Secret — the agent polls with it; never show it to the user, never write it anywhere |
user_code | string | Short code shown to the user (they confirm it matches the approval page) |
verification_uri | string | Approval page (bare) |
verification_uri_complete | string | Approval page with the code pre-filled — give the user this link |
expires_in | int | Seconds until the flow expires (600) |
interval | int | Seconds to wait between polls (5) |
POST /api/skill/device/token — poll for the sessionBody: {"device_code": "<device_code>"}. Response {status, ...}:
status | Meaning | Action |
|---|---|---|
authorization_pending | User hasn't approved yet | Wait interval seconds, poll again |
approved | Approved — response also has token | Use Cookie: vee_session=<token>; stop polling |
expired / denied / invalid / consumed | Flow is dead | Stop; start over with a fresh authorize |
The minted
tokenis a normal session (identical to a browser login). Treat it like a credential: never echo it, never store it in a task/caption/product field.
POST /api/riffs — one-shot riff (preferred entry)Content-Type: multipart/form-data
Source (exactly one):
| Param | Type | Notes |
|---|---|---|
video | File | Upload source video (≤100MB, and ≤ the render-duration cap — default 45s; see General constraints) |
tiktok_url | string | TikTok link (server downloads + extracts BGM) |
formula_id | string | Analyzed template ID (yours or a public one; status must be analyzed, else 400) |
Optional creative config:
| Param | Type | Default | Notes |
|---|---|---|---|
character_ids | string | "" | JSON array string ('["caden","chloe"]') or comma-separated (caden,chloe). Empty = Auto mode (no digital human, SD2 generates the person); non-empty = one task per character. Note it's a string, not an array (multipart limitation) |
product_id | string | "" | Empty = no product placement (no_product mode) |
product_visibility | string | on_camera | on_camera / off_camera; only effective when product_id is non-empty (ignored when empty) |
language | string | en | Must be a code from GET /api/languages (currently en / es); an invalid value returns 400 |
content_anchor | string | "" | Creative direction (≤5000 chars); to place a product image on camera, write that image's name in the text (on_camera; plain name match) |
user_hint | string | "" | Hook hint (≤5000); new source only — ignored when formula_id is given |
Response (RiffOut):
| Field | Type | Notes |
|---|---|---|
mode | string | "generate" (formula_id already analyzed → generation batch submitted now) / "analyze_then_generate" (new source → analyze submitted first; on completion the worker chains the generation) |
batch_id | string | The riff's handle — the analyze task and chained generation task share it; poll GET /api/tasks/batch/{batch_id} to track the whole run |
formula_id | string | Template ID (a new source creates a placeholder-named template, auto-renamed by a hook once analysis lands) |
analyze_task_id | string? | Analyze task ID (only in analyze_then_generate) |
task_ids | string[] | Generation task IDs (immediate in generate; in the chained mode they appear after analysis, fetched from the batch) |
Behavior notes:
POST /api/pipeline/batch — riff video (advanced / analyzed-template batch)riffs already covers nearly everything (including multi-character batches). This endpoint remains for fine-grained "analyzed template + explicit params" control; the agent rarely needs it.
| Field | Type | Req | Default | Notes |
|---|---|---|---|---|
formula_id | string | ✓ | Template ID (status must be analyzed) | |
character_ids | string[] | [] | Character ID array (an array here, unlike riffs' string). Empty array = Auto mode | |
product_id | string | null | null | null/omitted = no_product | |
product_visibility | string | on_camera | Only on_camera/off_camera; no_product is derived from product_id=null, never passed directly | |
content_anchor | string | "" | ≤5000 chars | |
language | string | ✓ | Must be a code from GET /api/languages |
Response (PipelineBatchResponse): batch_id / task_ids[] / total.
GET /api/formulasQuery: status (collected/analyzed/archived), template_type (use pipeline), tags (comma-separated), search, sort (created_desc/created_asc/used_desc), limit (default 50), offset.
Response (FormulaListOut): items: FormulaOut[] / total / limit / offset (header X-Total-Count = filtered total).
FormulaOut (customer-visible fields):
| Field | Type | Notes |
|---|---|---|
id / name | string | Template ID / name |
template_type | string? | Default "pipeline" (this skill only consumes this; filter out others) |
status | string | collected / analyzed / archived |
emotion_arc | string? | Emotion arc (generic funnel-stage sequence, e.g. "hook → build-up → cta") |
slot_count | int? | Number of formula segments |
used_count | int? | Times riffed (high = peer-validated) |
tags | string[] | Tags |
source_url / source_platform | string? | Original link / platform |
thumbnail_url | string? | Thumbnail |
analysis_prompt_is_latest | bool | false = analysis stale; refresh-analysis before using |
visibility | string | scope (this scope only) / public (platform-curated, prefer recommending) |
created_at | datetime? | Created |
hook_type/cta_typeand other formula-internal fields are engine IP and not exposed to customers.
GET /api/formulas/{formula_id} — extraction summaryResponse (FormulaDetailOut): all FormulaOut fields + analysis_card (always null for customers — the full formula DSL is engine IP) + extraction_summary (a safe projection).
extraction_summary (the agent reads this to understand the template, NOT analysis_card):
| Field | Type | Notes |
|---|---|---|
duration_seconds | float? | Source video length |
language | string? | Source language |
speaking_mode | string? | Speech form |
narrative | string | Plain-language record of "what happens" in the source (internal markers stripped) |
transcript | [{at, text, mode}] | Line-by-line dialogue (at = start second, mode = on_camera_dialogue/voiceover/no_dialogue) |
on_screen | [{kind, label, at}] | On-screen entity timeline (kind = person/subtitle/product_image/... label = human label) |
Reading the emotion formula: skim narrative + transcript for the emotional journey (how it hijacks attention, which state the viewer is moved from/to); combine with emotion_arc for funnel stages. Don't look for per-segment mechanism fields — that's the engine's core IP and isn't exposed; the agent reasons from narrative/transcript instead.
POST /api/formulas/{formula_id}/refresh-analysisWhen a template's analysis is stale (analysis_prompt_is_latest=false), re-run Stage A under the current prompt version. Response: task_id + "queued"; poll GET /api/tasks/{task_id}. It does not accept user_hint — to change the hint, riff a new source to build a fresh template.
Customers have no standalone "analyze a new video" entry — new sources go only through
POST /api/riffs(analyze→generate, paid); the library can only re-analyze existing templates.
GET /api/charactersResponse: CharacterOut[].
| Field | Type | Notes |
|---|---|---|
id / slug | string | Immutable identifier (same value; slug is the canonical name) |
name | string | Display name |
gender | string? | female / male / null |
age_range | string? | young / middle_aged / senior / null |
persona | string? | Free-text account identity — the single source for positioning / audience / tone |
reference_image | string? | Reference image path |
has_any_active_avatar | bool | The hard test for "can generate video" — true if any main avatar in history passed review (default false) |
has_any_processing_avatar / has_any_failed_avatar | bool | Has an avatar in review / failed |
seedance_asset | object? | The in-use / most-recent avatar's review record (status: processing/active/failed) |
active_avatar_id | string? | The in-use avatar row id |
stats | object? | Asset stats (total_assets / by_type) |
Choose a character by
personafeel +gender/age_range+has_any_active_avatar. Creating/editing characters (needs reference_image + persona) is left to the Settings UI; Riffkit doesn't proactively guide creation. There is nodescriptionfield (account identity lives entirely inpersona).
GET /api/products → ProductOut[]| Field | Type | Notes |
|---|---|---|
id / name | string | Product ID / name |
description | string? | Product description (the single source of product fact; Stage B infers category/tags from it as needed) |
target_audience | string? | Target audience |
images | ProductImageOut[] | Product images |
ProductImageOut:
| Field | Type | Notes |
|---|---|---|
id | string | Image ID |
url | string | Image URL |
name | string | Image name — write this name in content_anchor text to place the image on screen (on_camera). An unnamed image can't be referenced |
description | string | Image description |
usage_context | string | User-written "when to use this image" |
content_policy | string | locked (default, AI cannot edit it) / mutable (editable) |
caption_status | string | Background vision-captioning state: pending (generating) / "" (done) — don't treat pending as an error |
POST /api/productsBody (ProductUpdateRequest): name (✓), description (✓), target_audience. Response: ProductOut.
POST /api/products/{product_id}/imagesAdd an image. URL or file (either/or). Max 8 images per product, ≤50MB each, .jpg/.jpeg/.png/.webp.
Form:
| Field | Type | Req | Notes |
|---|---|---|---|
file | File | either/or | Upload image |
image_url | string | either/or | Public http(s) image URL |
name | string | ✓ | Image name (required on new upload; non-empty names are unique per product, trimmed, case-insensitive) |
image_id | string | Custom id (else derived from filename/URL; reserved words protagonist / supporting_a~z not allowed) | |
description | string | Image description (left blank → background auto-captioning, caption_status=pending meanwhile) | |
usage_context | string | When to use this image | |
content_policy | string | locked (default) / mutable |
Response: ProductOut (the full updated product).
Multi-image upload must be serial (one awaited after another, not parallel) — the backend does read-modify-write per product, and parallel uploads race and drop images.
GET /api/languages → Language[]| Field | Type | Notes |
|---|---|---|
code | string | BCP-47 code (en / es) to put in the language field |
name | string | Display name (English / Español) |
Currently
en/es(overseas-first; Mandarin output not exposed for now — adjusts with the product, so trust this endpoint's response, don't hardcode).riffsandpipeline/batchshare the same candidate set.
GET /api/tasks/{task_id} → TaskOut| Field | Type | Notes |
|---|---|---|
id | string | Task ID |
type | string | pipeline (riff generation) or analyze (template analysis) |
status | string | queued → running → completed / failed / dead / cancelled |
progress | int | 0-100 |
current_step | string? | Current step (e.g. "Stage B — creative adaptation") |
error | string? | Failure reason (sanitized + truncated) |
character_id / formula_id / formula_name / product_id | string? | Linked entities + template-name snapshot |
batch_id | string? | Batch ID |
product_visibility | string? | on_camera / off_camera / no_product (config replay) |
language | string? | Language code |
content_anchor | string? | Creative direction |
user_hint | string? | Hook hint (analyze tasks only) |
segment_count | int? | Number of video segments (pipeline only) |
submitted_by_user_id | string? | Submitter user_id (in a team scope, resolve to a member via /api/scopes/{id}/members; a solo scope = the owner) |
result | any? | On success, contains asset_id etc. |
created_at / started_at / finished_at | datetime | Timestamps (naive UTC; parse as UTC on the frontend) |
GET /api/tasks/batch/{batch_id} → BatchStatusOutbatch_id / total / completed / failed / running / queued / tasks: TaskOut[]. Preferred for tracking a whole riff.
GET /api/tasks — list tasksQuery: status (single or comma-separated allowlist like failed,dead), type (use pipeline), date_from / date_to (YYYY-MM-DD or full ISO 8601; Task.created_at is naive UTC), submitted_by_user_id (filter by submitter, only meaningful in a team scope), limit (default 100, 1-500), offset. Header X-Total-Count.
Response: TaskOut[]. Usage: "how many are running now" → ?status=running; "last 10 failures" → ?status=failed&limit=10; "today's tasks" → ?date_from=2026-06-20&date_to=2026-06-20.
GET /api/tasks/statsCounts grouped by type / status (for tab badges). Query: date_from / date_to / submitted_by_user_id (not status/type — those are the grouping dimensions). Response: total / by_type (e.g. {"pipeline":12}) / by_status (e.g. {"completed":9,"failed":2}).
POST /api/tasks/{task_id}/cancelCancel a queued/running task (other states → 409/400). Marks it cancelled (not failed); does not interrupt a running subprocess (it exits after the current step); external calls already made are charged and not refunded. Usage: on "stop it" → call and clearly say "what's already charged isn't refunded; running sub-steps finish the current stage before stopping." Don't proactively suggest cancelling unless a task is clearly hung.
POST /api/tasks/{task_id}/retryRetry a failed/dead task (retryable within 24h and only if the schema version matches). Creates a new worker with the same config = full re-charge. dead is usually a task reaped by a container restart, and retry is the only recovery. Usage: on "run it again" → first state the estimated credit cost (GET /api/usage/credits + duration estimate) → let the user decide; if the failure was user-fixable (bad product image / stale template analysis), fix the cause first.
GET /api/tasks/{task_id}/content — extraction/rewrite preview (optional)Review what the engine "extracted / rewrote" for a task, for the delivery strategy recap. Response (TaskContentOut): extraction (an analyze task's extraction, same shape as extraction_summary), rewrite (a generation task's rewrite: story / dialogue / caption / hashtags), template_name, content_anchor, user_hint.
GET /api/assetsQuery: asset_id (string[]), type (pipeline = generated video / upload = reference material), asset_role (final video = final_reel), character (string[]), product_id (string[]), formula_id (string[]), created_window (today/7d/30d/90d), sort (created_desc/created_asc/character_az/product_az), page (≥1), limit (1-200, default 50).
Response: AssetOut[].
| Field | Type | Notes |
|---|---|---|
id / type / asset_role / name | asset_role=final_reel is the finished riff | |
character_id / formula_id / formula_name / product_id | string? | Linked entities |
file_url | string? | Download path (append ${BASE_URL}) |
thumb_url | string? | Thumbnail |
sd_video_url | string? | Raw pre-post-processing SD video (only when the final had post-processing) |
caption | string? | Suggested copy (hook → body → closing CTA in one paragraph) |
asset_hashtags | string[] | Suggested hashtags |
batch_id / task_id | string? | Source batch / task |
metadata / extra_metadata | dict | Metadata |
created_at | datetime | Created |
POST /api/assets/upload (sidecar; not used by the main flow)Riff videos are derived from template + product + character — no manual material upload is needed. This endpoint only ingests user-provided reference videos/images. Form: file (✓, video ≤100MB / image ≤50MB), asset_role (✓, reference), product_id / character_id / name / notes (optional).
To download a finished video: just GET
${BASE_URL}${asset.file_url}.
Billing rules (use this framing when explaining to users): charged only by successfully generated video seconds — 720p is 10,000 credits/s (≈$1/s), 1080p is 15,000 credits/s (≈$1.5/s); analysis is free (re-riffing the same source reuses the cached analysis); you pay only for video seconds actually generated — a run that produces no video output costs nothing, but any seconds already rendered (including on cancel or a later-stage failure) are charged and not refunded. One standard video ≈ 15s @720p ≈ 150,000 credits. Subscription credits are valid for the period and don't roll over. Get the exact rate from
video_credits_per_secondonGET /api/billing/subscription— don't hardcode.
402 handling (hard constraint): when submit (riffs / pipeline/batch) lacks balance, it returns HTTP 402 with a structured detail:
| Field | Notes |
|---|---|
error | always "insufficient_credits" |
required_credits / available_credits | raw credits — convert to seconds (below); never shown to the user as-is |
topup_url | the upgrade link the backend issues — relay it verbatim, don't build a URL yourself |
On 402: no retry, no silent failure — present the shortfall in seconds ONLY (the only unit the app shows users — never credits, never USD): seconds ≈ credits ÷ video_credits_per_second (≈10000 @720p; exact value from GET /api/billing/subscription), e.g. "this riff needs ~15s but you only have ~8s left." Relay topup_url verbatim, and optionally call GET /api/billing/plans to introduce upgrades (instant, prorated against the remaining period). This is the only time you proactively mention balance.
GET /api/usage/credits — check balance| Field | Type | Notes |
|---|---|---|
available | float | Available credits (= total_remaining - held) — the only field for "can I submit" |
held | float | Total held by in-flight tasks |
total_remaining | float | Total unspent credits (including held) |
daily_spent / daily_limit | float | Spent today / daily cap (0 = unlimited) |
ledgers | array | Per-batch detail (type / remaining / held / expires_at); expiry-first deduction is transparent to the agent |
In a team scope this returns the owner's balance (shared by members);
daily_spent/daily_limitare computed for the calling member's own daily allowance.
GET /api/billing/plans — plan catalogNo params. Returns [{id, name, price_usd, credits, seconds, videos, unit_price_usd, purchasable}] (starter/daily/pro; purchasable=false = payments not configured).
GET /api/billing/subscription — current planNo params. Returns {plan_id?, plan_name?, status?, scheduled_plan_id?, cancel_at_period_end, current_period_start?, current_period_end?, stripe_enabled, video_credits_per_second}. plan_id=null = unsubscribed. Buying/upgrading/downgrading is a web action (Settings → Billing); the agent only guides, never orders.
Also available:
GET /api/usage/daily-budget(allowed/spent_credits/limit_credits/remaining_credits, the simple pre-submit gate),GET /api/usage/summary(usage aggregated by period,total_credits/total_cost_usd;groupsis empty for customers),GET /api/usage/history(daily history withuser_email; non-admins can only query themselves). Use summary for "how much have I used," credits for "can I generate again." Always report usage/spend to customers in seconds — the unit the app's meter shows —seconds ≈ total_credits ÷ video_credits_per_second(fromGET /api/billing/subscription); never surface raw credits / USD (matches the seconds-first billing UI + brand voice).
GET /api/scopes/{scope_id}/membersList scope members (you must be a member, else 403). Returns [{id, scope_id, user_id, role, daily_credits_limit, joined_at, user_email}]. Use it to resolve TaskOut.submitted_by_user_id to a user_email in a team scope. Not needed in a solo scope.
| Dimension | Limit | Source |
|---|---|---|
| Source video upload | ≤ 100 MB and ≤ the render-duration cap (max_render_duration, default 45 s, runtime-adjustable, ceiling 90s) — the SAME single number that caps the generated video, not a separate limit; over the duration → instant 400 + cleanup | POST /api/riffs video, assets/upload |
| Generated video length | ≤ max_render_duration (the same single cap as the source upload above) | engine render budget |
| Image upload | ≤ 50 MB each, ≤ 8 images per product, .jpg/.jpeg/.png/.webp | product images |
content_anchor / user_hint | ≤ 5000 chars | riffs / pipeline/batch |
| riffs rate | 10 / 60 s | POST /api/riffs |
| Task concurrency | 10 (overflow → queued) | server TaskRunner |
| Generation time | pipeline 3-8 min (empirical) | — |
| Poll interval | 10-15 s | Step 4 |
| Daily credit cap | daily_credits_limit (0 = unlimited) | adjustable by owner/admin |
BGM is handled by the backend automatically — the riff flow has no audio upload step.
| Intent | Example | Action |
|---|---|---|
| One-shot riff | "riff this link", "make me one from this video" | POST /api/riffs (source + optional config; confirm before submit) |
| Riff a new viral | "why did this TikTok pop off — riff it for me" | POST /api/riffs (pass tiktok_url/video → analyze→generate) |
| Run an existing template | "make one with template 3" | POST /api/riffs (pass formula_id) |
| Browse templates | "what templates are there", "which is hot lately" | GET /api/formulas?status=analyzed&template_type=pipeline (by used_count / tags) |
| Drill into a template | "tell me about this one", "why recommend it" | GET /api/formulas/{id}, read extraction_summary |
| Re-analyze an old template | "this is stale", "re-run analysis" | POST /api/formulas/{id}/refresh-analysis |
| Add a product / image | "I have a new product", "add an image to the product" | Restate + confirm → POST /api/products / POST /api/products/{id}/images (serial) |
| Pick language | "make it in Spanish", "switch language" | GET /api/languages for candidates → set language |
| On / off camera / none | "should the product show", "I don't want a product, just growth" | Explain product_visibility (incl. no product_id = no_product) + recommend a value |
| Check progress | "how's it going", "done yet" | GET /api/tasks/batch/{batch_id} or GET /api/tasks/{id} |
| Get results | "give me the download link" | GET /api/assets?asset_role=final_reel&... |
| Check balance / spend | "how much is left", "how much today" | GET /api/usage/credits / GET /api/usage/summary?period=today |
| Stop a task | "stop it", "cancel" | POST /api/tasks/{id}/cancel (note no refund of what's charged) |
| Retry | "try again", "re-run" | POST /api/tasks/{id}/retry (state estimated cost, confirm first) |
Routing principle: when intent is ambiguous, ask — don't guess and proceed.
| State | Meaning | Keep polling? |
|---|---|---|
queued | Submitted, waiting for a TaskRunner slot | ✅ |
running | Executing | ✅ |
completed | Output persisted; result carries asset_id | ❌ (fetch assets) |
failed | Normal failure (LLM error / API rate limit / …); error has the cause | ❌ (no auto-retry) |
dead | Reaped by a container restart (retry to recover) | ❌ (guide the user to retry) |
cancelled | User cancelled | ❌ |
queued → running → completed
↘ failed / dead / cancelled
| HTTP / error | Scenario | Handling |
|---|---|---|
401 unauthenticated | vee_session expired/missing | Re-run the device flow (POST /api/skill/device/authorize → user approves → poll .../token); see Auth |
402 insufficient_credits | not enough to submit | Show the shortfall in seconds (≈ credits ÷ video_credits_per_second) + relay topup_url verbatim, no retry |
400 — not exactly one source | missing or multiple sources | Ensure exactly one of video/tiktok_url/formula_id |
400 — required missing | name/description etc. not sent | Fill per the field tables; don't paper over with empty strings |
400 — invalid language | a code not in the candidates | First GET /api/languages for candidates |
400 — U+FFFD replacement char | request body not UTF-8 | chcp 65001 on Windows; use json= not data= |
413 file too large | over 100MB video / 50MB image | State the limit, recompress, re-upload |
429 rate limit | riffs > 10/60s or over the daily cap | Wait a bit; don't blindly retry |
500 / timeout | server error | Say try again later; if it recurs, report to the developers |
Task failed + error mentions "Seedance" | proxy / API failure | Surface the specific error, let the user decide |
Task queued over 2 min | concurrency full (cap 10) | Say "concurrency is full, please wait" |
The agent acts on the user's behalf and must be conservative, transparent, reversible:
/api/riffs, if suspicious (non-standard TikTok domain, possible phishing), gets a confirmation prompt first.Proactively flag anomalies (an undocumented error code / an internal field that shouldn't be exposed / the same task failing after 2 retries / balance dropping >10% in a minute for no reason).
status=queued).used_count / tags, the sharper the picks.asset.file_url is relative; full URL = ${BASE_URL}${file_url}.The riffkit skill is a general AI-agent skill — usable by any agent with "local skill loading + heartbeat scheduling" (Claude Code / Codex / others). Use placeholder paths, substituting your agent's directory convention.
# ${SKILLS_ROOT} = your AI agent's skills root, commonly:
# Claude Code project .claude/skills / global ~/.claude/skills
# Codex project .codex/skills / global ~/.codex/skills
export SKILLS_ROOT=<one of the paths above>
mkdir -p "${SKILLS_ROOT}/Riffkit" && cd "${SKILLS_ROOT}/Riffkit"
curl -sL "https://riffkit.ai/SKILL.md" -o SKILL.md
curl -sL "https://riffkit.ai/HEARTBEAT.md" -o HEARTBEAT.md
Filenames are case-sensitive: SKILL.md (this file), HEARTBEAT.md (version-check heartbeat).
One-off quick use (no local install): send the agent
curl -s https://riffkit.ai/SKILL.md; it runs the device flow (hands you a one-click approval link) to sign in — no token to copy. For long-term use, install locally + set up the heartbeat.
Check each item in order; on any failure, return to the previous step and reinstall:
ls "${SKILLS_ROOT}/Riffkit/" includes SKILL.md and HEARTBEAT.md, exact case.version equals curl -s https://riffkit.ai/SKILL.json | jq -r .version (currently: 1.1.0).curl -sS -o /dev/null -w "%{http_code}" https://riffkit.ai/api/auth/me returns 401 (no cookie is normal).curl -s -X POST https://riffkit.ai/api/skill/device/authorize returns JSON with a user_code.After self-check passes, continue to Heartbeat setup — the agent's path for auto-syncing skill updates.
The riffkit skill keeps evolving — folding it into your agent's periodic heartbeat is the robust way to auto-sync updates, or the local SKILL.md drifts out of date.
Frequency: once every 24 hours.
## Riffkit (every 24h)
> ⚠️ **Never compute Unix seconds, convert to UTC, or do the subtraction yourself.** The bash below does all the time math; just read the first stdout line's tag to branch.
1. Run the check:
```bash
STATE="${AGENT_MEMORY_DIR}/Riffkit-heartbeat-state.json"
mkdir -p "$(dirname "$STATE")"
[ -f "$STATE" ] || printf '{"lastHeartbeatCheck":null}' > "$STATE"
NOW=$(date +%s)
LAST=$(grep -o '"lastHeartbeatCheck"[^,}]*' "$STATE" | grep -oE '[0-9]+' || echo 0)
DIFF=$(( NOW - LAST ))
INTERVAL=$(( 24 * 60 * 60 ))
if [ "$DIFF" -ge "$INTERVAL" ]; then
echo "HBCHECK=DUE now=$NOW last=$LAST diff=${DIFF}s interval=${INTERVAL}s"
else
echo "HBCHECK=SKIP now=$NOW last=$LAST diff=${DIFF}s wait=$(( INTERVAL - DIFF ))s"
fi
```
2. **Branch strictly on the first stdout line's tag:**
- `HBCHECK=SKIP …` → **end the heartbeat, reply `HEARTBEAT_OK`**, do nothing else
- `HBCHECK=DUE …` → continue to step 3
3. Update the state file's `lastHeartbeatCheck` to the number after `now=` from the previous stdout (copy it; don't re-run `date`), and write the whole state back to `${AGENT_MEMORY_DIR}/Riffkit-heartbeat-state.json`.
4. Read the local `HEARTBEAT.md` and run the version check per its instructions.
Env conventions:
${SKILLS_ROOT}= your AI agent's skills root (the one exported in install step 1)${SKILL_DIR}=${SKILLS_ROOT}/Riffkit(whereSKILL.md/HEARTBEAT.mdlive)${AGENT_MEMORY_DIR}= your agent's runtime memory dir (holdsRiffkit-heartbeat-state.json). Usually~/.claude/memoryfor Claude Code,~/.codex/workspace/memoryfor Codex.The heartbeat only checks the version — no write requests. It never submits tasks or tops up credits for you.
| Intent | Example | Action |
|---|---|---|
| Check now | "check Riffkit for updates", "update the skill" | Skip throttling, read HEARTBEAT.md and run the version compare |
| Force re-download | "force-update Riffkit", "reinstall the skill" | curl-overwrite the local SKILL.md directly, no version compare |
After a manual trigger, also set lastHeartbeatCheck to the current Unix second (so the heartbeat doesn't fire again minutes later), using the same "never compute time by hand" script to read NOW and write it.