Xiaomi TTS Proxy

Pass. Audited by ClawScan on May 10, 2026.

Overview

This appears to be a disclosed local Xiaomi TTS proxy, but it uses your Xiaomi API key, runs a localhost service and FFmpeg, and optionally persists via systemd.

Install this only if you want a local Xiaomi TTS proxy. Use a trusted MIMO_TTS_BASE, protect the MIMO_TTS_KEY env file, keep the service bound to localhost, install FFmpeg from a trusted source, and review the missing systemd service unit before enabling auto-start.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Requests through the proxy can consume Xiaomi TTS quota and send the submitted text to the configured upstream service.

Why it was flagged

The proxy uses a Xiaomi API key from the environment as the upstream provider credential. This is expected for the stated TTS integration, but it gives the local service authority to use that key.

Skill content

const MIMO_KEY  = process.env.MIMO_TTS_KEY; ... "api-key": MIMO_KEY

Recommendation

Use a dedicated Xiaomi API key if possible, keep MIMO_TTS_BASE pointed only at Xiaomi or a trusted proxy, and rotate the key if the environment file is exposed.
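One way to act on the advice about MIMO_TTS_BASE is to check the value against a host allow-list before the proxy starts. The following is a minimal sketch, not part of the reviewed code: `isTrustedBase`, `TRUSTED_HOSTS`, and the host `tts.example.com` are hypothetical placeholders, and the real upstream hostname must be substituted.

```javascript
// Hypothetical allow-list; replace with the real upstream host(s) you trust.
const TRUSTED_HOSTS = new Set(["tts.example.com"]);

// Returns true only for an https:// URL whose hostname is on the allow-list.
function isTrustedBase(base, trustedHosts = TRUSTED_HOSTS) {
  let url;
  try {
    url = new URL(base);
  } catch {
    return false; // missing or malformed value
  }
  // Compare the parsed hostname, not a string prefix, so that values such as
  // "https://tts.example.com.evil.test" or "https://tts.example.com@evil.test"
  // are rejected.
  return url.protocol === "https:" && trustedHosts.has(url.hostname);
}
```

A startup script could call `isTrustedBase(process.env.MIMO_TTS_BASE)` and exit if it returns false, so the API key is never sent to an unexpected host.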

What this means

Other software running on the same machine could call the proxy and spend TTS quota or send text to the upstream provider.

Why it was flagged

The proxy accepts local POST requests without a separate auth check. Binding to 127.0.0.1 limits exposure, but local processes can still trigger TTS calls.

Skill content

if (req.method === "POST" && req.url?.startsWith("/audio/speech")) { handleTTS(req, res); } ... server.listen(PORT, "127.0.0.1", ...)

Recommendation

Keep the service bound to localhost, do not expose the port to a LAN or the internet, and stop the service when it is not needed.
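Because the loopback bind is the only thing limiting who can reach the proxy, a guard that refuses any other bind address can catch an accidental configuration change. This is a sketch under assumptions: `isLoopbackHost` and `assertLoopback` are hypothetical helpers, not functions from the reviewed tts-proxy.mjs, and the guard accepts only literal loopback addresses.

```javascript
// Returns true when `host` is a literal loopback address.
function isLoopbackHost(host) {
  if (host === "::1") return true; // IPv6 loopback
  // IPv4 loopback is the whole 127.0.0.0/8 block.
  const m = /^127\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  return m !== null && m.slice(1).every((octet) => Number(octet) <= 255);
}

// Hypothetical guard to call before server.listen(PORT, host, ...).
function assertLoopback(host) {
  if (!isLoopbackHost(host)) {
    throw new Error(`refusing to bind to non-loopback address: ${host}`);
  }
}
```

This does not add authentication: any process on the same machine can still call the proxy. It only prevents the service from being exposed to a LAN by a changed bind address such as 0.0.0.0.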

What this means

A local FFmpeg binary will process audio returned by the provider and write temporary files under /tmp/openclaw.

Why it was flagged

The skill runs FFmpeg as an external process for its documented audio format conversion. This is purpose-aligned and uses spawn without shell interpolation.

Skill content

const ff = spawn("ffmpeg", args);

Recommendation

Install FFmpeg from a trusted source and keep it updated; avoid exposing the proxy to untrusted network callers.
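Spawning without shell interpolation means each argument reaches FFmpeg as a separate argv entry, so characters such as `;` or `$` in a value are never interpreted by a shell. The audit does not show how the reviewed code builds `args`, so the following is only an illustrative sketch: `buildFfmpegArgs` and `ALLOWED_FORMATS` are hypothetical, while `-y`, `-i`, and `-f` are standard FFmpeg options.

```javascript
// Hypothetical allow-list of output container formats.
const ALLOWED_FORMATS = new Set(["mp3", "wav", "opus"]);

// Builds an argv array suitable for spawn("ffmpeg", args).
function buildFfmpegArgs(inputPath, outputPath, format) {
  if (!ALLOWED_FORMATS.has(format)) {
    throw new Error(`unsupported output format: ${format}`);
  }
  // A path starting with "-" would be parsed by ffmpeg as an option.
  for (const p of [inputPath, outputPath]) {
    if (typeof p !== "string" || p.startsWith("-")) {
      throw new Error(`unsafe path: ${p}`);
    }
  }
  // -y overwrites the output file, -i names the input, -f forces the format.
  return ["-y", "-i", inputPath, "-f", format, outputPath];
}
```

The returned array can be passed directly as the second argument of `spawn` from `node:child_process`. Validating the format against a fixed set keeps caller-supplied strings from becoming arbitrary FFmpeg options.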

What this means

If enabled, the proxy may keep running after installation and continue accepting localhost TTS requests.

Why it was flagged

The documentation recommends an optional persistent background service with auto-start. It is disclosed and fits a local proxy, but it continues running beyond a single task.

Skill content

The proxy runs as a systemd service, which enables start on boot and automatic management. ... sudo systemctl enable --now mimo-tts-proxy.service

Recommendation

Enable the systemd service only if you want always-on TTS, and know how to stop or disable it with systemctl.

What this means

If the service unit is obtained from somewhere else, its command, user, and environment handling may differ from what the reviewed code suggests.

Why it was flagged

The instructions reference a service unit file, but the provided file manifest contains only SKILL.md and tts-proxy.mjs, so that privileged persistence artifact is not available for review here.

Skill content

sudo cp mimo-tts-proxy.service /etc/systemd/system/

Recommendation

Do not install an unreviewed service unit from an untrusted source; review or create a minimal unit that runs only the reviewed tts-proxy.mjs with least privilege.