Parakeet STT

v1.1.0

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). Roughly 30x faster than realtime, 25 languages, automatic language detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name and description (local Parakeet STT) match the runtime instructions: clone the GitHub repo and run the service via Docker or Python/uvicorn. No unrelated credentials or binaries are requested.
Instruction Scope
SKILL.md stays within scope: it describes installation, running a local HTTP/OpenAI-compatible API, example curl and Python usage, and Docker management commands. It does instruct network operations (git clone, docker compose, pip install) which are expected for installing the stated software.
Install Mechanism
The registry contains no formal install spec, but the README-like instructions tell the user to clone a GitHub repo and run Docker or pip. Running code and containers pulled from an external repo is the primary risk here — the source is a GitHub repo (homepage provided), which is a normal release host, but the skill will cause arbitrary code to execute on the host if followed without review.
Credentials
The skill declares no required secrets or config paths. The only runtime environment hint is PARAKEET_URL (to point at the local service). There are no unexplained credential requests.
Persistence & Privilege
always is false and the skill is instruction-only (no code installed by the registry). It does not request persistent presence or modify other skills or system agent settings.
Assessment
This skill is coherent for running a local Parakeet STT service, but it tells you to fetch and run third-party code and containers. Before following the instructions:

  1. Inspect the GitHub repo (Docker Compose file, Docker images referenced, requirements.txt, and startup scripts) to ensure there are no unexpected network calls, telemetry, or privileged operations.
  2. Prefer running Docker Compose in an isolated or test environment (or a VM), and avoid exposing the service port to the public Internet.
  3. Verify the provenance of the Docker images (use images from trusted registries or build locally from the repo).
  4. Don't run as an elevated user if avoidable.
  5. If you require additional assurance, run a static review or use a sandboxed environment.

If you only need transcription and cannot audit the repo, consider using a well-vetted packaged distribution instead.
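One small guard that fits the advice above is to confirm that the address a client is about to send audio to is actually local. The sketch below is illustrative only: `is_loopback_url` is a hypothetical helper, not part of the skill or the upstream repo, and it deliberately skips DNS lookups, so any hostname other than `localhost` is treated as non-local.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host is 'localhost' or a literal loopback IP.

    Hypothetical helper for illustration. Hostnames other than 'localhost'
    are treated as non-local because no DNS lookup is performed.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a literal IP address and not 'localhost'.
        return False
```

A client script could call this on PARAKEET_URL and refuse to upload audio when it returns False.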

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🦜 Clawdis
MIT-0

Parakeet TDT (Speech-to-Text)

Local transcription using NVIDIA Parakeet TDT 0.6B v3 with ONNX Runtime. Runs on CPU — no GPU required. ~30x faster than realtime.

Installation

# Clone the repo
git clone https://github.com/groxaxo/parakeet-tdt-0.6b-v3-fastapi-openai.git
cd parakeet-tdt-0.6b-v3-fastapi-openai

# Run with Docker (recommended)
docker compose up -d parakeet-cpu

# Or run directly with Python
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 5000

The service listens on port 5000 by default. PARAKEET_URL tells clients where to find it; set it when the service is reachable at a different address (e.g., http://localhost:5092).

API Endpoint

OpenAI-compatible API at $PARAKEET_URL (default: http://localhost:5000).
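A client has to fall back to the default address when PARAKEET_URL is unset. A minimal sketch of that lookup follows; `transcription_endpoint` and `DEFAULT_URL` are hypothetical names introduced here, while the path `/v1/audio/transcriptions` is the one used in the curl examples in this document.

```python
import os

DEFAULT_URL = "http://localhost:5000"  # default address stated in this document


def transcription_endpoint(env=None):
    """Build the transcription URL from PARAKEET_URL, or from the default.

    `env` is any mapping of environment variables; it defaults to os.environ
    and exists mainly so the function can be exercised without touching the
    real environment.
    """
    env = os.environ if env is None else env
    base = env.get("PARAKEET_URL") or DEFAULT_URL
    # Tolerate a trailing slash in the configured base URL.
    return base.rstrip("/") + "/v1/audio/transcriptions"
```

Passing the environment as a parameter keeps the helper easy to check in isolation; in normal use it is called with no arguments.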

Quick Start

# Fall back to the default address when PARAKEET_URL is unset
PARAKEET_URL="${PARAKEET_URL:-http://localhost:5000}"

# Transcribe audio file (plain text)
curl -X POST "$PARAKEET_URL/v1/audio/transcriptions" \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=text"

# Get timestamps and segments
curl -X POST "$PARAKEET_URL/v1/audio/transcriptions" \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=verbose_json"

# Generate subtitles (SRT)
curl -X POST "$PARAKEET_URL/v1/audio/transcriptions" \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=srt"

Python / OpenAI SDK

import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("PARAKEET_URL", "http://localhost:5000") + "/v1",
    api_key="not-needed"
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=f,
        response_format="text"
    )
print(transcript)
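The single-file call above extends naturally to a whole folder. The following is a sketch under assumptions, not part of the upstream project: `transcribe_folder` is a hypothetical helper, `AUDIO_SUFFIXES` is an assumed list of extensions (this document only shows .mp3), and `client` is expected to be an OpenAI SDK client configured as shown above.

```python
from pathlib import Path

# Assumed set of extensions; adjust to whatever your service actually accepts.
AUDIO_SUFFIXES = {".mp3", ".wav", ".flac", ".m4a"}


def transcribe_folder(client, folder, model="parakeet-tdt-0.6b-v3"):
    """Transcribe each audio file in `folder` and write a .txt next to it.

    `client` is any object exposing audio.transcriptions.create(...), such as
    the OpenAI SDK client built in the example above. Returns the list of
    transcript paths that were written.
    """
    written = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # skip non-audio files, including earlier transcripts
        with path.open("rb") as f:
            text = client.audio.transcriptions.create(
                model=model, file=f, response_format="text"
            )
        out = path.with_suffix(".txt")
        out.write_text(str(text), encoding="utf-8")
        written.append(out)
    return written
```

Files are processed one at a time in name order, which keeps memory use flat and makes a failure easy to trace to a specific input.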

Response Formats

| Format       | Output                             |
|--------------|------------------------------------|
| text         | Plain text                         |
| json         | {"text": "..."}                    |
| verbose_json | Segments with timestamps and words |
| srt          | SRT subtitles                      |
| vtt          | WebVTT subtitles                   |
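To show how a verbose_json response might be consumed, here is a small sketch that turns segments into readable lines. It assumes the service mirrors OpenAI's verbose_json shape, with a top-level `segments` list whose items carry `start` and `end` (seconds) and `text`; that shape is an assumption, so check a real response from your instance first. `format_segments` is a hypothetical helper.

```python
def format_segments(payload):
    """Render each segment as '[start -> end] text', with times in seconds.

    Assumes `payload` is the parsed verbose_json body and follows the
    OpenAI-style layout: {"segments": [{"start": ..., "end": ..., "text": ...}]}.
    A payload without segments yields an empty list.
    """
    lines = []
    for seg in payload.get("segments", []):
        start = float(seg["start"])
        end = float(seg["end"])
        lines.append(f"[{start:.2f} -> {end:.2f}] {seg['text'].strip()}")
    return lines
```

With the `requests` or OpenAI client of your choice, pass the decoded JSON body to `format_segments` and print the returned lines.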

Supported Languages (25)

English, Spanish, French, German, Italian, Portuguese, Polish, Russian, Ukrainian, Dutch, Swedish, Danish, Finnish, Norwegian, Greek, Czech, Romanian, Hungarian, Bulgarian, Slovak, Croatian, Lithuanian, Latvian, Estonian, Slovenian

Language is auto-detected — no configuration needed.

Web Interface

Open $PARAKEET_URL (default: http://localhost:5000) in a browser for a drag-and-drop transcription UI.

Docker Management

# Check status
docker ps --filter "name=parakeet"

# View logs
docker logs -f <container-name>

# Restart
docker compose restart

# Stop
docker compose down
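After starting or restarting the container, a script may need to wait until the service accepts connections before sending audio. The sketch below polls the TCP port only; it does not confirm that the model has finished loading, and `wait_for_service` is a hypothetical helper rather than anything shipped with the project.

```python
import socket
import time
from urllib.parse import urlparse


def wait_for_service(url, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to the URL's host and port succeeds.

    Returns True once a connection is accepted, or False if `timeout`
    seconds pass first. This checks the port only, not application health.
    """
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_service("http://localhost:5000")` can be called right after `docker compose up -d parakeet-cpu` and before the first transcription request.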

Why Parakeet over Whisper?

  • Speed: ~30x faster than realtime on CPU
  • Accuracy: Comparable to Whisper large-v3
  • Privacy: Runs 100% locally, no cloud calls
  • Compatibility: Drop-in replacement for OpenAI's transcription API
