Skill v2.99.92

ClawScan security

Mimo Tts Asr 26 Free · ClawHub's context-aware review of the artifact, metadata, and declared behavior.

Scanner verdict

Suspicious · Apr 24, 2026, 4:49 AM

Verdict: suspicious
Confidence: high
Model: gpt-5-mini
Summary: The skill mostly matches a TTS/ASR utility but its declared requirements (none) do not match the actual instructions and scripts (which require API keys and ffmpeg), so there is a meaningful inconsistency you should review before installing.
Guidance: This skill appears to implement the advertised TTS/ASR features, but there are a few things to check before installing:
- Metadata mismatch: the registry lists no required env vars, but SKILL.md and the scripts require MIMO_API_KEY and/or MIMO_ASR_KEY (ASR will abort if no key is set). Confirm whether you must provide API keys.
- External network use: audio and reference samples will be uploaded to api.xiaomimimo.com (or a custom MIMO_API_ENDPOINT). If you have privacy concerns, do not upload sensitive audio or reference voice samples.
- ASR vs. fallback: TTS can fall back to local edge-tts when no key is present, but ASR appears to require a key, and the docs are ambiguous about this. Expect ASR to be cloud-only unless you run your own ASR models locally.
- Dependencies and system calls: the scripts invoke ffmpeg and use subprocess; install ffmpeg in a controlled environment and audit subprocess calls if you will run on sensitive systems.
- Source trust: the skill's source and owner are unknown and there is no homepage. If you don't trust the author, run it in a sandboxed environment or review the full scripts (they are included) before use.
Recommended actions: verify the author/source, confirm whether you need to supply API keys, review the included scripts for any unexpected network calls, and run the skill in an isolated container or VM if you will process private audio or use voice cloning.
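The environment checks in the guidance above can be run before installing. The following is a minimal sketch, not part of the skill: the function name `preflight` and the finding messages are hypothetical, and it assumes only what the review states (the variable names MIMO_API_KEY, MIMO_ASR_KEY, MIMO_API_ENDPOINT, the ffmpeg dependency, and the default host). Which key the ASR script actually reads is not confirmed by the review, so the sketch only flags the case where neither is set.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Default base URL named in the review; how the skill parses it is not shown there.
DEFAULT_ENDPOINT = "https://api.xiaomimimo.com/v1"


def preflight(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return human-readable findings; an empty list means nothing to flag."""
    findings: list[str] = []

    # The review says the scripts need MIMO_API_KEY and/or MIMO_ASR_KEY.
    if not (env.get("MIMO_API_KEY") or env.get("MIMO_ASR_KEY")):
        findings.append("no API key set: expect ASR to abort and TTS to use edge-tts")

    # ffmpeg is invoked via subprocess, so it has to be discoverable on PATH.
    if which("ffmpeg") is None:
        findings.append("ffmpeg not found on PATH")

    # A custom endpoint changes where audio and reference samples are uploaded.
    endpoint = env.get("MIMO_API_ENDPOINT")
    if endpoint and endpoint != DEFAULT_ENDPOINT:
        findings.append(f"custom endpoint in use: {endpoint}")

    return findings


if __name__ == "__main__":
    for line in preflight():
        print("WARN:", line)
```

Passing `env` and `which` as parameters keeps the check testable without touching the real environment; a run inside the sandbox you plan to use gives the most relevant result.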

Review Dimensions

Purpose & Capability: concern. The SKILL.md and included scripts clearly implement TTS and ASR and call a remote API (api.xiaomimimo.com) and/or local edge-tts. However, the registry metadata lists no required environment variables or credentials, while the documentation and code require MIMO_API_KEY and/or MIMO_ASR_KEY (the ASR script will exit if no key is set). The declared requirements therefore do not match what the skill actually needs.
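A mismatch of this kind can be checked by hand against the included scripts. The sketch below is one rough way to do it, assuming the scripts read variables through `os.environ` or `os.getenv` with literal names; it is not how ClawScan works, the helper names are hypothetical, and `sample` is an invented snippet rather than the skill's real code.

```python
import re

# Matches os.environ.get("X"), os.environ["X"] and os.getenv("X") with a literal name.
_ENV_PATTERN = re.compile(
    r"""os\.(?:environ\.get\(|environ\[|getenv\()\s*["']([A-Za-z_][A-Za-z0-9_]*)["']"""
)


def referenced_env_vars(source: str) -> set[str]:
    """Collect environment variable names that the source text reads."""
    return set(_ENV_PATTERN.findall(source))


def undeclared_env_vars(source: str, declared: set[str]) -> set[str]:
    """Names the code reads that the registry metadata does not declare."""
    return referenced_env_vars(source) - declared


# Invented example input, NOT taken from the skill's scripts.
sample = '''
key = os.environ.get("MIMO_API_KEY") or os.environ.get("MIMO_ASR_KEY")
base = os.getenv('MIMO_API_ENDPOINT', "https://api.xiaomimimo.com/v1")
'''

# The registry declared no env vars, so everything the code reads is reported.
print(sorted(undeclared_env_vars(sample, declared=set())))
# → ['MIMO_API_ENDPOINT', 'MIMO_API_KEY', 'MIMO_ASR_KEY']
```

A regex scan misses names built at runtime or read through helper wrappers, so an empty result does not prove the metadata is complete.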
Instruction Scope: note. The runtime instructions and scripts stay within TTS/ASR functionality (preprocess audio, call the remote API, fall back to edge-tts, caching, chunking). They perform file I/O (read/write audio, create a cache dir), spawn ffmpeg via subprocess, and upload audio to a remote API endpoint, all of which is expected for this purpose. Minor inconsistency: SKILL.md states 'no Key also can use' (the edge-tts fallback), but the ASR script enforces a key, so the doc's wording around ASR availability is ambiguous.
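To audit the subprocess usage mentioned above, one option is to list every `subprocess.<func>(...)` call in a script with Python's `ast` module and then read each call site. This is a minimal sketch with a hypothetical helper name, and `sample` is an invented one-liner, not the skill's code.

```python
import ast


def subprocess_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) for each subprocess.<func>(...) call."""
    hits: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "subprocess"
        ):
            hits.append((node.lineno, f"subprocess.{node.func.attr}"))
    return sorted(hits)


# Invented example input, NOT taken from the skill's scripts.
sample = (
    "import subprocess\n"
    "subprocess.run(['ffmpeg', '-i', 'in.wav', 'out.mp3'], check=True)\n"
)

print(subprocess_calls(sample))
# → [(2, 'subprocess.run')]
```

The scan only catches calls written as `subprocess.<name>(...)`; it will not see `from subprocess import run`, aliased imports, or `os.system`, so treat it as a starting point for a manual read.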
Install Mechanism: ok. There is no install spec that downloads arbitrary archives; the package is instruction-only with included Python scripts. Dependencies are installed via pip per the README (openai, edge-tts). No remote download URLs or extraction steps are present in the install metadata.
Credentials: concern. Using an API key (MIMO_API_KEY / MIMO_ASR_KEY) is proportionate to contacting a cloud TTS/ASR service. The concern is that the registry metadata omits the required env vars while SKILL.md and the scripts require them. The skill also sets an API base_url of https://api.xiaomimimo.com/v1, so you should expect audio and possibly voice-clone data (reference audio) to be uploaded to that external service. This is expected for cloud TTS but relevant for privacy.
Persistence & Privilege: ok. The skill does not request always: true and does not modify other skills' configs. It creates a local cache directory inside the skill workspace and uses /tmp for temporary files, which is normal for this kind of tool.