chichi-speech (local text-to-speech service with Qwen3-TTS model)
Pass. Audited by VirusTotal on May 12, 2026.
Overview
Type: OpenClaw Skill
Name: chichi-speech
Version: 1.0.2

The skill bundle provides a FastAPI-based text-to-speech service using the Qwen3 model. All files align with the stated purpose, including loading a pre-trained model and a reference audio file from legitimate public URLs (qianwen-res.oss-cn-beijing.aliyuncs.com). The `SKILL.md` instructions are clear and do not contain any prompt injection attempts. While the server defaults to listening on `0.0.0.0` in `src/chichi_speech/server.py` and allows specifying arbitrary `--ref-audio` URLs, these are common practices for web services and core features for voice cloning, respectively, and do not indicate intentional malicious behavior like data exfiltration or unauthorized execution.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The API endpoint is unauthenticated, and the code defaults to binding all interfaces, so starting the CLI without an explicit localhost host can expose the TTS service beyond the local machine. Other devices on the same reachable network could then request speech generation and consume local compute.
parser.add_argument("--host", type=str, default="0.0.0.0", help="Service host (default: 0.0.0.0)")
...
@app.post("/synthesize")

Run it with `--host 127.0.0.1` unless network access is intentional, and use firewalling or authentication if exposing it beyond localhost.
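As a minimal sketch of the recommendation above (only the `--host` flag name comes from the reviewed code; the rest is illustrative), a safer CLI would default to loopback and make network exposure an explicit opt-in:

```python
import argparse

# Sketch of a safer default: bind to loopback unless the operator
# explicitly opts in to network exposure with --host 0.0.0.0.
parser = argparse.ArgumentParser(description="chichi-speech server (illustrative)")
parser.add_argument(
    "--host", type=str, default="127.0.0.1",
    help="Bind address (default: 127.0.0.1; 0.0.0.0 exposes the service on all interfaces)",
)
parser.add_argument("--port", type=int, default=8000)

args = parser.parse_args([])  # no flags given: stays on loopback
print(args.host)  # → 127.0.0.1
```

With this default, exposing the service requires a deliberate `--host 0.0.0.0`, at which point adding a firewall rule or authentication layer is the operator's explicit choice.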
Installing the skill may fetch current package versions from package repositories, so behavior can vary over time and depends on the trustworthiness of those packages.
The install relies on multiple external Python packages, mostly without pinned versions. This is normal for a Python ML service, but it leaves exact dependency versions and provenance to the installer environment.
dependencies = [
"fastapi",
"uvicorn",
"requests",
"torch",
"soundfile",
"pydantic",
"qwen-tts",
"numba>=0.59.0",
]

Install in a virtual environment and consider pinning or reviewing dependency versions, especially `qwen-tts`, `torch`, and related ML packages.
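The dependency names below are copied from the list above; the helper itself is only an illustrative sketch of how an installer might flag unpinned entries before installing:

```python
# Dependency list as declared by the skill; only numba carries any
# version constraint, and none are pinned to an exact release.
dependencies = [
    "fastapi", "uvicorn", "requests", "torch",
    "soundfile", "pydantic", "qwen-tts", "numba>=0.59.0",
]

def unpinned(requirements):
    # Treat only exact '==' pins as fully reproducible installs;
    # range constraints like '>=' still float over time.
    return [r for r in requirements if "==" not in r]

print(unpinned(dependencies))  # every entry here lacks an exact pin
```

Running this against the declared list returns all eight entries, which is the point of the finding: exact versions and provenance are left to whatever the installer environment resolves at install time.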
The first run may download or load external model assets, which can be large and whose contents are outside this artifact review.
The service loads a pretrained model from an external model identifier at startup. This is purpose-aligned for TTS, but it is an external artifact that is not included in the reviewed files.
model = Qwen3TTSModel.from_pretrained(
    "Qwen/Qwen3-TTS-12Hz-1.7B-Base",

Use trusted model sources, verify model/package provenance where possible, and run in an isolated environment if concerned.
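One way to gain confidence in externally downloaded model assets is to checksum them and compare against a digest obtained out of band. This sketch uses only the standard library; the reviewed files publish no reference digest, so the expected value is an assumption that would come from a trusted release note:

```python
import hashlib

def sha256_of(path):
    # Stream the file in chunks so large model weights are hashed
    # in constant memory rather than read whole.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Illustrative usage: the expected digest must come from a trusted
# source out of band; none is provided in the reviewed artifact.
# expected = "..."
# assert sha256_of("model.safetensors") == expected
```

Combined with an isolated environment (container or VM without sensitive credentials), this limits the blast radius if a fetched artifact is ever swapped upstream.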
