Gipformer ASR
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This appears to be a coherent Vietnamese speech-to-text tool. The main considerations are its external model and dependency downloads, and where audio files are sent for transcription.
Before installing, use a virtual environment, review the PyPI dependencies and Hugging Face model source, keep the server bound to 127.0.0.1 unless you intentionally secure it, and avoid sending sensitive audio to a remote --server URL.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
First run depends on external model artifacts; if the upstream source or dependency environment changes, the local transcription environment may change too.
The server fetches model files from Hugging Face at startup. This is expected for the stated ASR purpose, but it creates a normal third-party model provenance dependency.
paths[key] = hf_hub_download(repo_id=REPO_ID, filename=filename)
Install in an isolated environment, verify the Hugging Face model source, and consider pinning dependency/model versions for reproducible deployments.
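One way to act on the pinning advice is sketched below. It assumes you have already reviewed a specific commit of the model repository and recorded the SHA-256 of each file. `hf_hub_download` accepts a `revision` argument. The repository name, commit hash, and the helper names `sha256_of` and `fetch_pinned` are hypothetical placeholders, not values taken from this skill.

```python
import hashlib
from pathlib import Path

# Hypothetical placeholders: substitute the real repository and a commit hash you reviewed.
REPO_ID = "example-org/example-asr-model"
PINNED_REVISION = "0123456789abcdef0123456789abcdef01234567"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_pinned(filename: str, expected_sha256: str) -> Path:
    """Download one file at a fixed revision and reject it if the hash differs."""
    # Imported lazily so the hashing helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    path = Path(
        hf_hub_download(repo_id=REPO_ID, filename=filename, revision=PINNED_REVISION)
    )
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise RuntimeError(f"{filename}: expected {expected_sha256}, got {actual}")
    return path
```

Pinning the revision keeps the downloaded files stable across runs, and the hash check catches a file that differs from the one you reviewed.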
If exposed on a network, other reachable clients could submit audio for processing and consume local compute.
The documented server can bind to all network interfaces. The default is local, but using this option broadens who can reach the transcription API.
python serve.py --host 0.0.0.0 --num-threads 8
Keep the server bound to 127.0.0.1 for personal use, or add network access controls if intentionally exposing it.
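A startup check along these lines could flag a non-local bind before the server starts listening. This is a minimal sketch, not part of the reviewed `serve.py`: the function names are hypothetical, and treating the name `localhost` as loopback is an assumption about the host's resolver configuration.

```python
import ipaddress
from typing import Optional


def is_loopback_host(host: str) -> bool:
    """Return True if `host` is a loopback address such as 127.0.0.1 or ::1."""
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address on this machine.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname); treat it as potentially exposed.
        return False


def exposure_warning(host: str) -> Optional[str]:
    """Return a warning message for non-loopback binds, or None if the bind is local."""
    if is_loopback_host(host):
        return None
    return (
        f"Binding to {host} makes the transcription API reachable from other "
        "machines; add network access controls before exposing it."
    )


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "0.0.0.0"):
        print(candidate, "->", exposure_warning(candidate))
```

The check only reports the risk. Firewall rules or an authenticating reverse proxy would still be needed if the server is exposed on purpose.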
Processing malicious or malformed media could expose the local ffmpeg installation to parser vulnerabilities.
The server invokes ffmpeg to decode audio. The command arguments are fixed and purpose-aligned, but media parsing is still a local execution surface.
result = subprocess.run(["ffmpeg", "-y", "-i", tmp_path, "-f", "wav", ...], capture_output=True, timeout=120)
Keep ffmpeg updated and avoid exposing the API to untrusted uploaders unless the host is properly isolated.
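Limiting what reaches the decoder is one inexpensive complement to keeping ffmpeg patched. The sketch below rejects empty or oversized uploads before any decoding happens. The 50 MiB limit and the function name are assumptions chosen for illustration, and the check does not protect against a small but malicious file.

```python
import os

# Hypothetical limit; choose a value that matches the audio you expect to receive.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def check_upload_size(path: str, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it is empty or too large."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("upload is empty")
    if size > max_bytes:
        raise ValueError(f"upload is {size} bytes, limit is {max_bytes}")
    return size
```

Calling this on the temporary file before the `subprocess.run` step would keep very large inputs away from ffmpeg, which also bounds decode time and disk use.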
Private voice recordings or meeting audio could be transmitted to a non-local server if the user changes the server URL.
The client reads the full audio file and sends it to the configured HTTP server URL. The default server is localhost, but the --server option can point elsewhere.
audio_b64 = base64.b64encode(f.read()).decode("ascii") ... requests.post(f"{server_url}/transcribe", json={"audio_b64": audio_b64}, timeout=600)
Use the default localhost server for private audio, and only send audio to remote servers you trust.
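A client-side guard could make the local-versus-remote distinction explicit before any audio leaves the machine. The sketch below is an illustration only: `is_local_server` is a hypothetical helper, the port in the usage example is a placeholder, and treating the name `localhost` as loopback is an assumption.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_server(server_url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(server_url).hostname
    if host is None:
        # No host could be parsed (for example a URL without a scheme).
        return False
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address on this machine.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A DNS name other than localhost is treated as remote.
        return False


if __name__ == "__main__":
    # The port is a placeholder, not the skill's documented default.
    for url in ("http://127.0.0.1:8000", "https://asr.example.com"):
        print(url, "->", "local" if is_local_server(url) else "remote")
```

A client could refuse to upload, or ask for confirmation, whenever this returns False for the value passed to `--server`.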
