chichi-speech (local text-to-speech service with Qwen3-TTS model)

Result: Pass. Audited by ClawScan on May 10, 2026.

Overview

This appears to be a coherent local text-to-speech service, but users should be aware that it installs external ML dependencies and models, and that the server binds to all network interfaces if started with its code defaults.

This skill looks consistent with its stated purpose. Install it in a virtual environment, expect downloads of external packages/models, and start the service with `--host 127.0.0.1` unless you intentionally want other machines to access it. Only use voice reference audio that you have permission to clone.

Findings (3)

This is an artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Finding 1: Unauthenticated service binds to all network interfaces by default

What this means

If the service is exposed, other devices on the reachable network can request speech generation and consume local compute.

Why it was flagged

The API endpoint is unauthenticated and the code default binds to all interfaces, so starting the CLI without an explicit localhost host can expose the TTS service beyond the local machine.

Skill content
parser.add_argument("--host", type=str, default="0.0.0.0",
                    help="Service host (default: 0.0.0.0)")
...
@app.post("/synthesize")
Recommendation

Run it with `--host 127.0.0.1` unless network access is intentional, and use firewalling or authentication if exposing it beyond localhost.
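The safer binding can also be made the code default rather than an opt-in flag. A minimal sketch, assuming the service uses argparse as the excerpt suggests; the parser structure and option names here are illustrative, not the skill's actual code:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="local TTS service (sketch)")
    # Default to loopback so the service is never exposed by accident;
    # wider exposure requires an explicit --host 0.0.0.0.
    parser.add_argument("--host", type=str, default="127.0.0.1",
                        help="Service host (default: 127.0.0.1)")
    parser.add_argument("--port", type=int, default=8000,
                        help="Service port (default: 8000)")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Would serve on {args.host}:{args.port}")
```

With this default, starting the CLI with no arguments keeps the endpoint on localhost, and exposing it to other machines becomes a deliberate choice.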

Finding 2: Unpinned external dependencies

What this means

Installing the skill may fetch current package versions from package repositories, so behavior can vary over time and depends on the trustworthiness of those packages.

Why it was flagged

The install relies on multiple external Python packages, mostly without pinned versions. This is normal for a Python ML service, but it leaves exact dependency versions and provenance to the installer environment.

Skill content
dependencies = [
    "fastapi",
    "uvicorn",
    "requests",
    "torch",
    "soundfile",
    "pydantic",
    "qwen-tts",
    "numba>=0.59.0",
]
Recommendation

Install in a virtual environment and consider pinning or reviewing dependency versions, especially `qwen-tts`, `torch`, and related ML packages.
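One lightweight way to make dependency drift visible is to record the versions actually resolved in the environment. A sketch using only the standard library; the helper name is invented for illustration, and the package list mirrors the dependency spec above:

```python
from importlib import metadata

def report_versions(packages):
    """Return a {package: installed_version} map, marking missing packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "NOT INSTALLED"
    return versions

if __name__ == "__main__":
    # Packages from the skill's dependency spec.
    deps = ["fastapi", "uvicorn", "requests", "torch",
            "soundfile", "pydantic", "qwen-tts", "numba"]
    for pkg, ver in report_versions(deps).items():
        print(f"{pkg}: {ver}")
```

Logging this at startup (or committing its output as a lockfile) turns "behavior can vary over time" into a diffable record.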

Finding 3: External model assets loaded at first run

What this means

The first run may download or load external model assets, which can be large and whose contents are outside this artifact review.

Why it was flagged

The service loads a pretrained model from an external model identifier at startup. This is purpose-aligned for TTS, but it is an external artifact that is not included in the reviewed files.

Skill content
model = Qwen3TTSModel.from_pretrained(
    "Qwen/Qwen3-TTS-12Hz-1.7B-Base",
Recommendation

Use trusted model sources, verify model and package provenance where possible, and run in an isolated environment if provenance cannot be fully established.