faster-whisper Chinese Edition - High-Performance Local Speech-to-Text Tool

v1.0.5

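A local speech-to-text tool built on faster-whisper. It supports high-performance, GPU-accelerated transcription with word-level timestamps and distilled models. Use this skill when the user asks to "transcribe audio" (转录音频), asks for "speech to text" (语音转文字), or mentions "whisper".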


OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for mapleshadow/faster-whisper-zh.


Install the skill "faster-whisper 中文版 - 高性能本地语音转文字工具" (mapleshadow/faster-whisper-zh) from ClawHub.
Skill page: https://clawhub.ai/mapleshadow/faster-whisper-zh
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required binaries: ffmpeg, python3
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install faster-whisper-zh

ClawHub CLI

npx clawhub@latest install faster-whisper-zh

Security Scan

VirusTotal: Suspicious

OpenClaw: Suspicious (medium confidence)

✓

Purpose & Capability

Name/description, required binaries (ffmpeg, python3), included scripts (transcribe.py, batch_transcribe.sh, setup.sh) and requirements (faster-whisper, torch) line up with a local transcription tool using faster-whisper; nothing required appears unrelated to the stated purpose.

ℹ

Instruction Scope

Runtime instructions are focused on installing and running a local transcription pipeline. They instruct running setup.sh, creating a venv, and running the provided Python scripts. The scripts set HF_HOME and HF_ENDPOINT environment variables and perform network downloads to fetch models/packages. They do not read unrelated system credentials or arbitrary files, but they do set HF_ENDPOINT to an external mirror in examples and the batch script.

Install Mechanism

There is no registry install spec but setup.sh creates a venv and installs packages via pip (requirements.txt and conditional torch installs via the official PyTorch wheel index). However, examples and scripts export HF_ENDPOINT=https://hf-mirror.com and batch_transcribe.sh unconditionally exports that mirror — directing model downloads to a third-party endpoint is high-risk if the mirror is untrusted. Overall install approach (pip in venv) is expected, but the mirror usage is the main concern.

Credentials

The skill declares no required credentials and does not request secrets. However, it repeatedly sets HF_HOME (/config/huggingface) and HF_ENDPOINT (https://hf-mirror.com) in docs and scripts. HF_HOME is benign (a cache path) but hard-coded /config/huggingface may be outside a user's expectations. HF_ENDPOINT pointing to an unknown third-party mirror is disproportionate and could redirect model downloads to an attacker-controlled host.

✓

Persistence & Privilege

always is false and the skill does not request persistent platform-wide privileges or modify other skills. It only creates a virtualenv and makes local scripts executable — standard behavior for a packaged CLI tool.

What to consider before installing

This skill is consistent with a local faster-whisper transcription tool, but exercise caution before running its install or batch scripts. Key points:

- The main risk is HF_ENDPOINT=https://hf-mirror.com: the skill's scripts and examples set this third-party mirror as the model/package endpoint. That will make your machine download model files from this unknown host. Verify the mirror's trustworthiness before using it. Prefer the official Hugging Face endpoints (or your own trusted mirror), or remove/override HF_ENDPOINT.
- Review setup.sh and transcribe.py before running. They install packages with pip and may download large binaries (models, PyTorch). Run in an isolated environment (VM or container) and ensure you have sufficient disk space.
- The scripts hard-code HF_HOME=/config/huggingface in examples and batch_transcribe.sh; change this to a safe local path if needed to avoid writing to unexpected locations.
- If you proceed: run ./setup.sh only after inspecting it, run pip installs inside the created .venv, and monitor network traffic (or pre-download models from known sources). If you cannot verify hf-mirror.com, remove those exports or replace them with official endpoints (e.g., unset HF_ENDPOINT so downloads use defaults).
- Because the source and homepage are unknown, prefer running this tool in an isolated environment until you confirm the mirror and packages are trustworthy.
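One way to act on the review advice above is to scan a script's text for HF_ENDPOINT exports before running it. The following is a minimal sketch, not part of the skill: the function name, the regular expression, and the one-entry allowlist of trusted hosts are all assumptions made for illustration.

```python
import re

# Hosts treated as trusted here; this allowlist is an assumption for illustration.
TRUSTED_HOSTS = {"huggingface.co"}

# Matches lines such as: export HF_ENDPOINT=https://example.org
EXPORT_RE = re.compile(
    r"^\s*export\s+HF_ENDPOINT=['\"]?(https?://[^\s'\"]+)", re.MULTILINE
)

def find_untrusted_endpoints(script_text: str) -> list[str]:
    """Return HF_ENDPOINT URLs exported in a shell script whose host is not trusted."""
    untrusted = []
    for url in EXPORT_RE.findall(script_text):
        host = url.split("://", 1)[1].split("/", 1)[0].lower()
        if host not in TRUSTED_HOSTS:
            untrusted.append(url)
    return untrusted

sample = "export HF_HOME=/config/huggingface\nexport HF_ENDPOINT=https://hf-mirror.com\n"
print(find_untrusted_endpoints(sample))
```

In practice the text would come from reading setup.sh or batch_transcribe.sh; an empty result only means no export of this exact form was found, not that the script is safe.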


Runtime requirements

Bins: ffmpeg, python3

latest vk976vdavpges996674wf9b8ra5850ktd

116 downloads

1 star

5 versions

Updated 1w ago

v1.0.5

MIT-0

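Faster-Whisper Chinese Edition

A high-performance local speech-to-text tool built on faster-whisper.

Installation and Setup

1. Run the setup script

Run the setup script to create a virtual environment and install the dependencies. The script automatically detects an NVIDIA GPU and enables CUDA acceleration when one is found.

./setup.sh

System requirements:

Python 3.10 or later
ffmpeg (installed on the system)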

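Usage

Use the transcription script to transcribe audio files.

Use cases

Turning meeting recordings into written minutes
Turning voice notes into text records
Extracting content from audio files
Organizing interview recordings
Turning training recordings into written material
Generating video subtitles
Transcribing podcasts
Speech to text
Audio to text

Basic transcription

export HF_HOME=/config/huggingface
export HF_ENDPOINT=https://hf-mirror.com
.venv/bin/python3 scripts/transcribe.py audio.mp3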

Advanced options

Choose a model: .venv/bin/python3 scripts/transcribe.py audio.mp3 --model large-v3-turbo
Word-level timestamps: .venv/bin/python3 scripts/transcribe.py audio.mp3 --word-timestamps
JSON output: .venv/bin/python3 scripts/transcribe.py audio.mp3 --json
Voice activity detection (silence removal): .venv/bin/python3 scripts/transcribe.py audio.mp3 --vad
Set the language: .venv/bin/python3 scripts/transcribe.py audio.mp3 --language zh
GPU acceleration: .venv/bin/python3 scripts/transcribe.py audio.mp3 --device cuda
CPU optimization: .venv/bin/python3 scripts/transcribe.py audio.mp3 --device cpu --compute-type int8
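To make the options above more concrete, here is a hedged sketch of how a subset of these flags could map onto the faster-whisper Python API. It is not the skill's actual transcribe.py, which this page does not show. The parser, the helper names, and the defaults are assumptions; `WhisperModel` and its `transcribe` method, with the `language`, `word_timestamps`, and `vad_filter` parameters, come from the faster-whisper library.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the flags listed above."""
    p = argparse.ArgumentParser(description="Transcribe audio with faster-whisper")
    p.add_argument("audio")
    p.add_argument("--model", default="large-v3-turbo")
    p.add_argument("--language", default=None)
    p.add_argument("--device", default="auto")
    p.add_argument("--compute-type", default="default")
    p.add_argument("--word-timestamps", action="store_true")
    p.add_argument("--vad", action="store_true")
    return p

def transcribe_kwargs(args: argparse.Namespace) -> dict:
    """Translate CLI flags into keyword arguments for WhisperModel.transcribe."""
    return {
        "language": args.language,
        "word_timestamps": args.word_timestamps,
        "vad_filter": args.vad,
    }

def run(argv: list[str]) -> None:
    # Imported lazily so the flag handling can be read and tested without the package.
    from faster_whisper import WhisperModel

    args = build_parser().parse_args(argv)
    model = WhisperModel(args.model, device=args.device, compute_type=args.compute_type)
    segments, info = model.transcribe(args.audio, **transcribe_kwargs(args))
    print(f"Detected language: {info.language}")
    for seg in segments:  # segments is a generator; decoding happens as it is consumed
        print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")

# Only the flag mapping is exercised here; run() is what would load a model.
args = build_parser().parse_args(["audio.mp3", "--language", "zh", "--vad"])
print(transcribe_kwargs(args))
```

Keeping the flag-to-keyword mapping in its own function makes it possible to check the option handling without downloading a model.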

Full command examples

# Chinese transcription with GPU acceleration
.venv/bin/python3 scripts/transcribe.py 会议录音.mp3 --language zh --device cuda --compute-type float16

# English transcription with word-level timestamps
.venv/bin/python3 scripts/transcribe.py interview.wav --language en --word-timestamps --json

# Fast CPU transcription tuned for performance
.venv/bin/python3 scripts/transcribe.py audio.m4a --device cpu --compute-type int8 --model distil-large-v3

# Batch processing script (a shell script, so run it with bash rather than python3)
bash scripts/batch_transcribe.sh /path/to/audio/files/

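Available models

large-v3-turbo (default): recommended for multilingual or accuracy-critical tasks
large-v3: the original large model, highest accuracy
distil-large-v3: best balance of speed and accuracy
medium: mid-sized, balanced performance
small: small model, fast
base: basic model, low resource requirements
tiny: smallest model, fastest
medium.en, small.en: faster English-only variants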

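Model selection guide

Model	Size	Recommended use	Hardware
`large-v3-turbo`	1.5GB	Professional-grade transcription	High-performance GPU
`medium`	1.5GB	Balanced performance	Standard configuration
`distil-large-v3`	756MB	General-purpose transcription (Distil-Whisper checkpoints are trained on English audio; verify accuracy on Chinese before relying on it)	Mid-range configuration
`small`	500MB	Fast transcription	Low-end configuration
`tiny`	150MB	Real-time transcription	Minimal configuration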
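The table above can also be read as a lookup: given a disk or download budget, pick the largest listed model that fits. The sketch below is illustrative only. The sizes are copied from the table (1.5GB taken as 1536 MB), and the function name and tie-breaking rule are assumptions.

```python
from typing import Optional

# Approximate sizes in MB, copied from the table above (1.5GB taken as 1536 MB).
MODEL_SIZES_MB = {
    "large-v3-turbo": 1536,
    "medium": 1536,
    "distil-large-v3": 756,
    "small": 500,
    "tiny": 150,
}

def largest_model_within(budget_mb: int) -> Optional[str]:
    """Return the largest listed model that fits the budget, or None if none fits."""
    fitting = [(size, name) for name, size in MODEL_SIZES_MB.items() if size <= budget_mb]
    if not fitting:
        return None
    best_size = max(size for size, _ in fitting)
    # On a size tie, the model listed first in the table wins.
    return next(name for size, name in fitting if size == best_size)

print(largest_model_within(800))
print(largest_model_within(100))
```

Size is only a proxy here; language coverage and accuracy still have to be weighed separately when choosing a model.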

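Performance tuning

GPU acceleration settings

# NVIDIA GPU (CUDA)
.venv/bin/python3 scripts/transcribe.py audio.mp3 --device cuda --compute-type float16

# Apple Silicon (macOS); CTranslate2, the backend of faster-whisper, targets cpu and cuda devices, so fall back to --device cpu if mps is rejected
.venv/bin/python3 scripts/transcribe.py audio.mp3 --device mps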

CPU optimization settings

# High-performance CPU
.venv/bin/python3 scripts/transcribe.py audio.mp3 --device cpu --compute-type int8 --beam-size 3

# Low-resource environment
.venv/bin/python3 scripts/transcribe.py audio.mp3 --device cpu --compute-type int8 --model small --beam-size 1
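The GPU and CPU presets above can be selected automatically. This is a small sketch under one stated assumption: the presence of nvidia-smi on PATH is used as a heuristic for a usable NVIDIA GPU, which is not a guarantee that CUDA libraries are installed. Both function names are hypothetical.

```python
import shutil

def pick_runtime(has_nvidia_gpu: bool) -> tuple[str, str]:
    """Return (device, compute_type) following the presets above."""
    if has_nvidia_gpu:
        return ("cuda", "float16")
    return ("cpu", "int8")

def detect_nvidia_gpu() -> bool:
    # Heuristic assumption: nvidia-smi on PATH suggests a usable NVIDIA driver.
    return shutil.which("nvidia-smi") is not None

device, compute_type = pick_runtime(detect_nvidia_gpu())
print(f"--device {device} --compute-type {compute_type}")
```

The printed flags can be appended to any of the transcribe.py commands shown in this document.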

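Troubleshooting

Common problems

GPU not detected: make sure the NVIDIA driver and CUDA are installed correctly. Transcription on CPU is significantly slower.
Out-of-memory errors: use a smaller model (such as small or base) or pass --compute-type int8
Model download fails: set the environment variable HF_ENDPOINT=https://hf-mirror.com to use a mirror in mainland China
Unsupported audio format: convert the audio with ffmpeg: ffmpeg -i input.m4a output.wav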
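The ffmpeg conversion mentioned above can be scripted. The sketch below builds the command as a list and runs it with subprocess. The function names are hypothetical, and the choice of 16 kHz mono output is an assumption for illustration: it matches the sample rate Whisper models operate on, but the conversion shown in the troubleshooting entry works without those flags.

```python
import subprocess
from pathlib import Path

def ffmpeg_to_wav_cmd(src: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command converting src to a mono WAV next to the source file."""
    # Note: if src already ends in .wav, the output path equals the input path.
    dst = str(Path(src).with_suffix(".wav"))
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]

def convert(src: str) -> str:
    """Run the conversion and return the output path."""
    cmd = ffmpeg_to_wav_cmd(src)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return cmd[-1]

# Only the command is built here; convert() is what would invoke ffmpeg.
print(ffmpeg_to_wav_cmd("input.m4a"))
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces or non-ASCII characters.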

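Fixes for specific errors

CUDA unavailable

# Check the CUDA installation
nvidia-smi

# If the driver was missing, install it, then re-run the setup script
./setup.sh

ffmpeg not found

# Ubuntu/Debian
sudo apt install ffmpeg

# macOS
brew install ffmpeg

# CentOS/RHEL (ffmpeg is usually not in the base repositories; an extra repository such as RPM Fusion may be needed)
sudo yum install ffmpeg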

Python version too old

# Check the Python version
python3 --version

# Python 3.10+ is required
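The individual checks in this troubleshooting section can be combined into one preflight script. The following is a minimal sketch with hypothetical function names. It checks the Python version and looks for ffmpeg and nvidia-smi on PATH, where nvidia-smi is only relevant for GPU use.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)

def python_ok(version: tuple = sys.version_info[:2]) -> bool:
    """True if the given (major, minor) version meets the documented minimum."""
    return tuple(version) >= MIN_PYTHON

def preflight() -> dict[str, bool]:
    """Check the documented requirements; nvidia-smi is optional (GPU only)."""
    return {
        "python>=3.10": python_ok(),
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }

for name, ok in preflight().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```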

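Environment variables

# Set the Hugging Face cache directory (avoids repeated downloads)
export HF_HOME=/config/huggingface

# Use a mirror in mainland China to speed up downloads
export HF_ENDPOINT=https://hf-mirror.com

# Select which GPU is visible to CUDA (if needed)
export CUDA_VISIBLE_DEVICES=0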
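When the script is launched from Python rather than a shell, the same variables can be passed through a per-process environment instead of being exported globally. This is a sketch only: the function name and the cache path are assumptions. Leaving the endpoint unset lets the Hugging Face libraries use their default endpoint.

```python
import os
from typing import Optional

def hf_environment(cache_dir: str, endpoint: Optional[str] = None) -> dict[str, str]:
    """Copy of the current environment with Hugging Face settings applied."""
    env = dict(os.environ)
    env["HF_HOME"] = cache_dir
    if endpoint:
        env["HF_ENDPOINT"] = endpoint
    else:
        env.pop("HF_ENDPOINT", None)  # fall back to the library's default endpoint
    return env

# Example cache location; any writable directory works.
env = hf_environment(os.path.expanduser("~/.cache/huggingface"))
print("HF_ENDPOINT" in env)
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when invoking transcribe.py, so the mirror choice stays explicit for each run.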

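Batch processing

Create a batch_transcribe.sh script for batch processing:

#!/bin/bash
# Batch transcription script: run it from the skill directory that contains .venv and scripts/
for audio_file in *.mp3 *.wav *.m4a; do
    # Skip patterns that matched no file
    if [ -f "$audio_file" ]; then
        echo "Processing: $audio_file"
        # Call the virtual environment's python3 directly, as in the other examples
        .venv/bin/python3 scripts/transcribe.py "$audio_file" --output "${audio_file%.*}.txt"
    fi
done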
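The same loop can be written in Python, which makes it easy to take the audio directory as a parameter. This is a hedged sketch, not the skill's batch_transcribe.sh. The function names are hypothetical, and it reuses the `--output` flag exactly as it appears in the shell script above.

```python
import subprocess
import tempfile
from pathlib import Path

AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a"}

def batch_commands(audio_dir: str, python: str = ".venv/bin/python3") -> list[list[str]]:
    """One transcribe.py command per audio file in audio_dir, sorted by name."""
    commands = []
    for path in sorted(Path(audio_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES:
            out = path.with_suffix(".txt")
            commands.append([python, "scripts/transcribe.py", str(path), "--output", str(out)])
    return commands

def run_batch(audio_dir: str) -> None:
    """Run every command, continuing with the next file if one fails."""
    for cmd in batch_commands(audio_dir):
        print("Processing:", cmd[2])
        subprocess.run(cmd, check=False)

# Demo on a throwaway directory: commands are only built, nothing is executed.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.wav", "a.mp3", "notes.txt"):
        Path(tmp, name).touch()
    for cmd in batch_commands(tmp):
        print(" ".join(cmd))
```

Separating command construction from execution keeps the file selection logic checkable without a model or virtual environment in place.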

Output formats

Plain-text output

[00:00:00.000 --> 00:00:05.000] 欢迎使用 faster-whisper 语音转文字工具。
[00:00:05.000 --> 00:00:10.000] 这是一个高性能的本地转录解决方案。
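The bracketed timestamp layout shown above can be produced from segment start and end times given in seconds. The following is a minimal sketch with hypothetical function names; it works in integer milliseconds to avoid floating-point drift when splitting hours, minutes, and seconds.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"

def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment in the plain-text layout shown above."""
    return f"[{format_timestamp(start)} --> {format_timestamp(end)}] {text.strip()}"

print(format_segment(0.0, 5.0, "欢迎使用 faster-whisper 语音转文字工具。"))
```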

JSON output

{
  "text": "完整的转录文本...",
  "segments": [
    {
      "start": 0.0,
      "end": 5.0,
      "text": "欢迎使用 faster-whisper 语音转文字工具。",
      "words": [
        {"word": "欢迎", "start": 0.0, "end": 0.5},
        {"word": "使用", "start": 0.5, "end": 1.0}
      ]
    }
  ]
}
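For the subtitle use case, JSON in the shape shown above can be converted to SRT, which numbers each cue and uses a comma before the milliseconds. This sketch assumes the `segments`, `start`, `end`, and `text` keys exactly as in the sample; the function names are hypothetical.

```python
import json

def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def json_to_srt(doc: dict) -> str:
    """Convert a transcript dict with a 'segments' list into SRT text."""
    blocks = []
    for i, seg in enumerate(doc["segments"], start=1):
        timing = f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}"
        blocks.append(f"{i}\n{timing}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)  # blank line between cues

sample = json.loads(
    '{"text": "...", "segments": [{"start": 0.0, "end": 5.0, "text": "欢迎使用", "words": []}]}'
)
print(json_to_srt(sample))
```

Word-level entries under `words` are ignored here; they could be used instead of segments to build shorter cues.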

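Changelog

v1.0.5 (2026-04-16)

Reworked the commands
Commands now run the virtual environment's python3 directly instead of requiring the virtual environment to be activated first

Support

If you run into problems:

Read the troubleshooting section of this document
Check that the system requirements are met
Make sure the network connection works (downloading models requires network access)
Read the script's error messages to debug

Tip: the first run downloads the selected model (large-v3-turbo is about 1.5GB). Make sure you have enough disk space and a stable network connection.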
