api-quality-check

v1.0.0

Check coding-model API quality, capability fit, and drift with LT-lite and B3IT-lite. Use when Codex needs to verify whether an OpenAI/OpenAI-compatible/Anth...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for chekhovin/api-quality-check.

Prompt Preview: Install & Setup
Install the skill "api-quality-check" (chekhovin/api-quality-check) from ClawHub.
Skill page: https://clawhub.ai/chekhovin/api-quality-check
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install api-quality-check

ClawHub CLI


npx clawhub@latest install api-quality-check
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The skill name/description (API quality checks for coding-model endpoints) matches the delivered files: a main Python script and two shell wrappers that run smoke tests, baseline creation, and drift detection against vendor endpoints. Nothing in the code or docs requests unrelated cloud credentials or system-wide privileges. Minor inconsistency: the runtime docs and examples rely on an $API_KEY environment variable or provider.json entries, but the registry metadata lists no required env vars—so the skill expects user-supplied API keys (in configs or env) even though none are declared in metadata.
Instruction Scope
SKILL.md explicitly instructs the agent to run the bundled scripts, keep outputs file-based, and run smoke → baseline → detect flows. The scripts only reference provider configs, output paths, and optional CODEX_HOME. They do not instruct reading unrelated system files or secrets beyond the provider config / API key. The agent is instructed to use file-based artifacts (JSON/HTML) and not to collect or transmit other local data.
Install Mechanism
No installer or remote download is present; this is an instruction+script bundle (no extract-from-URL installs). It requires a Python runtime and the 'requests' package at runtime, which is reasonable for a network-testing script and is not disproportionate.
Credentials
The skill expects API keys (per-provider api_key fields or example use of $API_KEY) to talk to external model endpoints. That is proportionate to the purpose. However, the registry metadata lists no required env vars while the docs repeatedly show using $API_KEY—this mismatch should be noted. Also the config allows arbitrary custom headers and extra body fields, which is necessary for some vendors but means user-supplied secrets/headers will be sent to the configured endpoints.
Persistence & Privilege
always:false and no special privileges requested. The scripts write outputs and baselines to user-specified directories only (no modifications to other skills or global agent settings). This level of presence is appropriate for a monitoring/tooling skill.
Assessment
This skill appears to do what it says: it runs headless quality/drift checks against model endpoints you configure. Before installing or running it:

  1. Recognize that it will send your supplied API keys and prompts to whatever base_url you provide; do not point it at untrusted endpoints.
  2. The SKILL.md examples use $API_KEY but the skill metadata did not declare required env vars; supply keys in provider.json or export $API_KEY as shown.
  3. Do not commit real API keys or private provider configs to source control; use placeholder values in committed files.
  4. Review provider.json/providers.json entries (base_url, headers, extra_body) to ensure headers or extra bodies do not leak sensitive tokens to unexpected domains.
  5. Run the scripts in an isolated environment (or sandbox/container) if you need to limit network exposure, and ensure Python and the 'requests' package are available.

If you want stronger assurance, request that the author declare required env vars (e.g., API_KEY) in the registry metadata and document any dependencies explicitly.

Like a lobster shell, security has layers — review code before you run it.

latest: vk978jq4m8nkhm4rym3x2hg1wjd83gcam
120 downloads
0 stars
1 version
Updated 1mo ago
v1.0.0
MIT-0

API Quality Check

Use the bundled script to run headless API-quality checks. Treat this skill as script-first: do not recreate LT-lite/B3IT-lite logic inline unless the script is clearly insufficient.

Provider names such as Ark/Volcengine, GLM, DeepSeek, Kimi, SiliconFlow, and similar services are examples only. The primary decision is the endpoint protocol type: OpenAI, OpenAI-Compatible, or Anthropic.

Quick start

Set the path once:

export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export APIQ="$CODEX_HOME/skills/api-quality-check/scripts/api_quality_check.py"
export APIQ_BATCH="$CODEX_HOME/skills/api-quality-check/scripts/run_batch_checks.sh"
export APIQ_DAILY="$CODEX_HOME/skills/api-quality-check/scripts/run_daily_check.sh"
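
Before running anything, you can sanity-check that the script is where these variables point (a quick optional check, not part of the skill itself):

test -f "$APIQ" || echo "api_quality_check.py not found under $CODEX_HOME/skills/api-quality-check"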

Run a capability smoke test first:

python "$APIQ" smoke \
  --provider "OpenAI-Compatible" \
  --base-url "https://ark.cn-beijing.volces.com/api/coding/v3" \
  --api-key "$API_KEY" \
  --model-id "ark-code-latest" \
  --html-output ./smoke.html

For many OpenAI-compatible endpoints, the same command also works if the user pastes the full .../chat/completions URL. The script will normalize it back to the API root automatically.
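
For example, this variant (a sketch assuming the Ark endpoint exposes the standard /chat/completions suffix) should behave the same as the root-URL call above:

python "$APIQ" smoke \
  --provider "OpenAI-Compatible" \
  --base-url "https://ark.cn-beijing.volces.com/api/coding/v3/chat/completions" \
  --api-key "$API_KEY" \
  --model-id "ark-code-latest" \
  --html-output ./smoke.html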

If you want a ready-to-run provider.json first, generate it with:

python "$APIQ" init-config \
  --provider "OpenAI-Compatible" \
  --base-url "https://api.siliconflow.cn/v1/chat/completions" \
  --api-key "$API_KEY" \
  --model-id "deepseek-ai/DeepSeek-V3.2" \
  --name "siliconflow-v3-2" \
  --config-output ./provider.json
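
If you prefer to write the file by hand instead, a minimal sketch might look like this; the field names are inferred from the CLI flags and are assumptions, so treat references/config-schema.md as the authoritative schema and never put a real key in a committed file:

# Hypothetical shape only; confirm field names against references/config-schema.md.
cat > ./provider.json <<'EOF'
{
  "name": "siliconflow-v3-2",
  "provider": "OpenAI-Compatible",
  "base_url": "https://api.siliconflow.cn/v1",
  "api_key": "REPLACE_ME",
  "model_id": "deepseek-ai/DeepSeek-V3.2"
}
EOF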

If an endpoint requires client-specific headers, put them in the config JSON as a headers object or pass them with --headers-json. For Kimi coding endpoints, use {"User-Agent":"KimiCLI/2.0.0"} only when the address is under https://api.kimi.com/coding; for the OpenAI-compatible Kimi path, use https://api.kimi.com/coding/v1.
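
For example, a smoke run against the OpenAI-compatible Kimi path might look like the following; the model id is a placeholder, so substitute the one you actually test:

python "$APIQ" smoke \
  --provider "OpenAI-Compatible" \
  --base-url "https://api.kimi.com/coding/v1" \
  --api-key "$API_KEY" \
  --model-id "YOUR_KIMI_MODEL_ID" \
  --headers-json '{"User-Agent":"KimiCLI/2.0.0"}' \
  --output ./kimi-smoke.json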

If you already have multiple raw endpoint entries, normalize them into providers.json with:

python "$APIQ" init-batch-config \
  --configs ./raw-providers.json \
  --config-output ./providers.json

Or run the full batch pipeline:

"$APIQ_BATCH" ./providers.json ./api-quality-out

That command also creates ./api-quality-out/index.html as the landing page for all generated reports.
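
To browse the generated HTML reports from a headless box, one option (ordinary Python, not part of the skill) is the built-in static file server:

python -m http.server 8000 --directory ./api-quality-out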

For one endpoint that you want to check every day and archive by date:

bash "$APIQ_DAILY" ./provider.json ./daily-out my-endpoint

Workflow

  1. Run smoke before any baseline or detect run.
  2. If you have many endpoints, run batch-smoke with a config list before choosing which ones deserve deeper LT/B3IT work.
  3. Read the result (a jq sketch for extracting these fields follows this list):
    • b3it_supported=true: the endpoint can return normal first-token text at max_tokens=1
    • lt_supported=true: the endpoint also returns logprobs, so LT-lite can run
    • recommended_detector: the script's direct recommendation for the next step
  4. If lt_supported=false, do not force LT-lite; pivot to B3IT-lite or report that LT is unavailable.
  5. Save baselines to explicit JSON files and reuse them for later detection.
  6. Keep outputs file-based for coding CLIs and OpenClaw. Do not depend on GUI state.
  7. For noisy endpoints, prefer the built-in B3IT defaults before tightening or loosening thresholds manually.
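
A minimal sketch for pulling the step-3 fields out of a smoke result with jq, assuming they sit at the top level of smoke.json (the real report may nest them per endpoint, so adjust the filter if needed):

jq '{b3it_supported, lt_supported, recommended_detector}' ./smoke.json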

Endpoint Types

  • OpenAI: use this for official OpenAI-style endpoints.
  • OpenAI-Compatible: use this for third-party endpoints that follow OpenAI request and response shapes; vendor-specific headers may be required.
  • Anthropic: use this for /v1/messages style endpoints; in this skill it is B3IT-only (see the config sketch after this list).
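
For the Anthropic type, an init-config call might look like the sketch below; the base URL and model id are illustrative assumptions, so check references/endpoint-types-playbook.md and references/kimi-anthropic-quickstart.md for the endpoints you actually use:

python "$APIQ" init-config \
  --provider "Anthropic" \
  --base-url "https://api.anthropic.com" \
  --api-key "$API_KEY" \
  --model-id "YOUR_ANTHROPIC_MODEL_ID" \
  --config-output ./anthropic-provider.json

Because Anthropic mode is B3IT-only in this skill, follow the smoke check with b3it-baseline and b3it-detect rather than the LT-lite commands.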

Commands

Capability smoke

python "$APIQ" smoke --config ./provider.json --output ./smoke.json

Generate a provider config template

python "$APIQ" init-config \
  --provider "OpenAI-Compatible" \
  --base-url "https://api.siliconflow.cn/v1/chat/completions" \
  --api-key "$API_KEY" \
  --model-id "deepseek-ai/DeepSeek-V3.2" \
  --config-output ./provider.json

Generate a batch providers.json template

python "$APIQ" init-batch-config \
  --configs ./raw-providers.json \
  --config-output ./providers.json

Batch capability smoke

python "$APIQ" batch-smoke --configs ./providers.json --output ./batch-smoke.json --html-output ./batch-smoke.html

Batch LT-lite baselines

python "$APIQ" batch-lt-baseline \
  --configs ./providers.json \
  --output-dir ./lt-baselines \
  --output ./batch-lt-baselines.json \
  --html-output ./batch-lt-baselines.html

Batch LT-lite detect

python "$APIQ" batch-lt-detect \
  --configs ./providers.json \
  --baseline-manifest ./batch-lt-baselines.json \
  --output ./batch-lt-report.json \
  --html-output ./batch-lt-report.html

Batch B3IT-lite baselines

python "$APIQ" batch-b3it-baseline \
  --configs ./providers.json \
  --output-dir ./b3it-baselines \
  --output ./batch-b3it-baselines.json \
  --html-output ./batch-b3it-baselines.html

Batch B3IT-lite detect

python "$APIQ" batch-b3it-detect \
  --configs ./providers.json \
  --baseline-manifest ./batch-b3it-baselines.json \
  --output ./batch-b3it-report.json \
  --html-output ./batch-b3it-report.html \
  --detection-repeats 5 \
  --min-stable-count 2 \
  --min-stable-ratio 0.35 \
  --confirm-passes 1

LT-lite baseline

python "$APIQ" lt-baseline --config ./provider.json --output ./lt-baseline.json

LT-lite detect

python "$APIQ" lt-detect \
  --config ./provider.json \
  --baseline ./lt-baseline.json \
  --output ./lt-report.json

B3IT-lite baseline

python "$APIQ" b3it-baseline --config ./provider.json --output ./b3it-baseline.json

B3IT-lite detect

python "$APIQ" b3it-detect \
  --config ./provider.json \
  --baseline ./b3it-baseline.json \
  --output ./b3it-report.json \
  --detection-repeats 5 \
  --min-stable-count 2 \
  --min-stable-ratio 0.35 \
  --confirm-passes 1

Daily single-endpoint drift run

bash "$APIQ_DAILY" ./provider.json ./daily-out my-endpoint

Defaults and guardrails

  • Default to non-streaming, timeout=60, and temperature values matched to the detector.
  • Every command can additionally write a human-readable report with --html-output.
  • OpenAI/OpenAI-compatible requests may include custom JSON headers, either from the config file or --headers-json.
  • The Kimi-specific {"User-Agent":"KimiCLI/2.0.0"} header is not a general default. Use it only for https://api.kimi.com/coding endpoints; for the OpenAI-compatible Kimi path, use https://api.kimi.com/coding/v1.
  • The script auto-disables thinking for common reasoning-first providers such as Ark, Doubao, GLM, and Zhipu unless extra_body is explicitly provided in the config JSON.
  • For OpenAI/OpenAI-compatible endpoints that still return reasoning_content without normal text, the script will retry once with {"thinking":{"type":"disabled"}} before failing.
  • OpenAI/OpenAI-compatible configs may use either the API root or a full .../chat/completions URL; the script normalizes the base URL internally.
  • init-config writes the normalized config explicitly, including any auto-selected extra_body, so the saved file is portable across Codex, Claude Code, Gemini CLI, and OpenClaw runs.
  • init-batch-config does the same normalization for a whole config list and writes a ready-to-run providers.json.
  • Anthropic mode is treated as B3IT-only in this skill.
  • If the endpoint returns reasoning/thinking blocks but no normal text, report that as a capability mismatch instead of fabricating a token result.
  • If the endpoint does not return logprobs, report that LT-lite is unavailable instead of guessing.
  • B3IT baseline discovery on OpenAI/OpenAI-compatible endpoints will automatically retry with a stronger candidate-search profile if an initial low-cost run finds no border inputs.
  • B3IT detection defaults are tuned for lower false positives: detection-repeats=5, min-stable-count=2, min-stable-ratio=0.35, confirm-passes=1.
  • Daily runs auto-refresh older B3IT baselines when they predate the stable-support filtering fields.

Resources

Open only what you need:

  • Workflow examples and CLI usage: references/workflows.md
  • Protocol-first playbook for OpenAI, OpenAI-Compatible, and Anthropic: references/endpoint-types-playbook.md
  • Kimi coding endpoint quickstart: references/kimi-coding-quickstart.md
  • Kimi Anthropic endpoint quickstart: references/kimi-anthropic-quickstart.md
  • JSON config format: references/config-schema.md
  • Example provider list: references/providers.example.json
  • Main executable: scripts/api_quality_check.py
  • One-shot batch wrapper: scripts/run_batch_checks.sh
  • Daily single-endpoint wrapper: scripts/run_daily_check.sh
