VoiceTrust

v0.4.1

Interpret VoiceTrust results for owner verification on voice/audio inputs. Use when you need the meaning of VoiceTrust fields, trust labels, command-gating d...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for kunyancai/voicetrust.

Prompt Preview: Install & Setup
Install the skill "VoiceTrust" (kunyancai/voicetrust) from ClawHub.
Skill page: https://clawhub.ai/kunyancai/voicetrust
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install voicetrust

ClawHub CLI


npx clawhub@latest install voicetrust

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability

Name and description match the implementation: the code implements speaker verification, audio quality scoring, and decision gating. Requested dependencies (torch/speechbrain) and local model files are appropriate for a local speaker-verification system. No unrelated credentials, binaries, or config paths are requested.

Instruction Scope

SKILL.md limits runtime actions to running STT, running VoiceTrust, and local bootstrap via references/quickstart.md. The included CLI/demo scripts and pipeline operate on local audio files and local owner enrollment state; they do not instruct reading unrelated system files or exfiltrating data. The skill explicitly warns that owner enrollment data should remain local.

Install Mechanism

The bundle is instruction-only (no platform install spec), but includes scripts that download large model checkpoint files at runtime from a GitHub raw mirror (raw.githubusercontent.com/ChuaKhunngan/VoiceTrust/...). This is a reasonable approach for model-heavy tools, and the downloader includes hard-coded SHA-256 checks for each file (good). Still, downloading binary checkpoints from a personal mirror carries supply-chain risk: users should verify the canonical repo, the raw URLs, and the checksums before running downloads. The use of GitHub raw is common, but the mirror is not an official upstream release host (the script documents the SpeechBrain upstream).

Credentials

The skill declares no required environment variables or credentials and operates on local filesystem paths under the skill runtime (assets/, data/). The README mentions resolving FFMPEG_BIN if present, which is a reasonable runtime hint; no unexplained SECRET/TOKEN/PASSWORD requirements are present.

Persistence & Privilege

The always flag is false, and the skill does not request persistent platform-level privileges. It stores local owner enrollment data and voiceprints under runtime/data/owners and runtime/data/voiceprints (expected for this purpose). It does not modify other skills or system-wide agent config.
Assessment

This skill appears to do what it claims: local speaker verification using SpeechBrain models. Before installing or running it:

  1. Inspect and verify the raw model URLs and the hard-coded SHA-256 checksums in scripts/ensure_models.py (the script will fail if checksums don't match). Prefer obtaining model checkpoints from the official upstream (SpeechBrain / HuggingFace) if you want stronger provenance guarantees.
  2. Be aware the runtime will write enrollment/voiceprint files to runtime/data/owners and runtime/data/voiceprints; do not share those files if you want the enrolled identity private.
  3. The runtime requires heavy ML dependencies (torch, speechbrain, etc.) and optionally ffmpeg; expect a large install and CPU/GPU resource use.
  4. If you are concerned about network supply-chain risk, run ensure_models.py only after manually auditing the URLs, or populate the assets directory with checkpoints you obtained from an upstream release.

Otherwise the skill is internally consistent and proportionate for its stated purpose.
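
The checksum discipline described above can be sketched in a few lines. This is not the actual code from scripts/ensure_models.py; the function names and structure are illustrative, showing only the pinned-digest comparison the scan report says the downloader performs:

```python
import hashlib
import hmac

def verify_checksum(data: bytes, expected_hex: str) -> bool:
    """Return True only if data hashes to the pinned SHA-256 digest.

    hmac.compare_digest is used for a constant-time comparison; for a
    local integrity check a plain == would also be acceptable.
    """
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.lower())
```

A downloader built on this would fetch the bytes first, call verify_checksum before writing the file into assets/, and abort on mismatch, which matches the fail-closed behavior described above.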


Latest: vk97d8h6rttkhvka6nh2mdzdjg983jxkp
136 downloads · 1 star · 3 versions
Updated 1mo ago · v0.4.1 · MIT-0

VoiceTrust

VoiceTrust answers one question: is this audio likely spoken by the enrolled owner?

Normal use:

  • run STT for content
  • run VoiceTrust for owner verification
  • merge both before replying

Do not use this skill to define machine-specific commands. Local routing and machine policy belong elsewhere.

Runtime note

This skill bundle is lightweight:

  • source code and setup docs are included
  • large model files are not bundled
  • owner enrollment data is local runtime state and must not be published

If model assets are missing, read references/quickstart.md.

Output fields

VoiceTrust results may include:

  • speaker_match
  • audio_quality
  • overall_trust
  • confidence
  • identity_score
  • trust_label
  • decision
  • decision_reasons
  • speaker_id
  • speech_duration
  • speech_ratio
  • vad_status
  • failure_reason
  • raw_scores.speaker_similarity

How to use the result

Use trust_label for concise rendering. Use decision for command gating. Do not treat audio quality alone as owner identity evidence.

Trust label

  • high: identity_score >= 85 and confidence >= 80 and no failure
  • medium: identity_score >= 72 and confidence >= 68 and no failure
  • low: everything else

Common downgrade signals:

  • vad_status != "ok"
  • speech_duration < 2.5
  • speech_ratio < 0.45
  • speaker_match < 70
  • failure_reason != null
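
A minimal sketch of the labeling and downgrade rules above, assuming the result is available as a plain dict with the documented field names (the helper names are mine, not the skill's):

```python
def trust_label(r: dict) -> str:
    """Map a VoiceTrust result to high/medium/low per the thresholds above."""
    failed = r.get("failure_reason") is not None
    if not failed and r.get("identity_score", 0) >= 85 and r.get("confidence", 0) >= 80:
        return "high"
    if not failed and r.get("identity_score", 0) >= 72 and r.get("confidence", 0) >= 68:
        return "medium"
    return "low"

def downgrade_signals(r: dict) -> list:
    """Collect the common downgrade signals listed above, as strings."""
    signals = []
    if r.get("vad_status") != "ok":
        signals.append("vad_status != ok")
    if r.get("speech_duration", 0.0) < 2.5:
        signals.append("speech_duration < 2.5")
    if r.get("speech_ratio", 0.0) < 0.45:
        signals.append("speech_ratio < 0.45")
    if r.get("speaker_match", 0) < 70:
        signals.append("speaker_match < 70")
    if r.get("failure_reason") is not None:
        signals.append("failure_reason present")
    return signals
```

Missing fields default toward "low" / downgraded, which is the safe direction for a verification gate.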

Command gating

For voice command execution:

  • use the normal path when speech_duration >= 3.0
  • allow a short voice sample only when all of the following are true:
    • speech_duration >= 1.2
    • speaker_match >= 85
    • confidence >= 85
  • in all cases, command execution still requires:
    • speaker_match >= 78
    • confidence >= 80
    • identity_score >= 82
    • vad_status == "ok"
    • failure_reason == null
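
The gating rules above combine a baseline that applies to every sample with a duration-dependent path. A sketch, assuming the same dict-shaped result (the helper is illustrative, not the skill's own gating code):

```python
def allow_command(r: dict) -> bool:
    """Apply the command-gating rules above to a VoiceTrust result dict."""
    # Baseline requirements that hold in all cases.
    baseline = (
        r.get("speaker_match", 0) >= 78
        and r.get("confidence", 0) >= 80
        and r.get("identity_score", 0) >= 82
        and r.get("vad_status") == "ok"
        and r.get("failure_reason") is None
    )
    if not baseline:
        return False
    duration = r.get("speech_duration", 0.0)
    if duration >= 3.0:
        return True  # normal path
    # Short-sample path: stricter match/confidence thresholds.
    return (
        duration >= 1.2
        and r.get("speaker_match", 0) >= 85
        and r.get("confidence", 0) >= 85
    )
```

Note that a sample between 1.2 s and 3.0 s can pass only if it clears both the baseline and the stricter short-sample thresholds.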

Interpretation:

  • decision == "allow_command" means command execution may proceed
  • decision != "allow_command" means do not execute commands from this sample
  • non-command voice content may still be handled normally
  • music / non-speech / non-command audio should not enter the command path

CLI example:

uv run --python .venv/bin/python ../scripts/demo.py \
  --audio /path/to/sample.ogg \
  --speaker owner \
  --json

Human rendering

Preferred compact format:

  • Voice trust: high / medium / low
  • Details: match <x> - trust <y> - confidence <z> - identity <i> - quality <q>
  • if relevant: Decision: allow_command / reject_command

If degraded, say why briefly using decision_reasons. Do not over-claim certainty.
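
The compact format above can be produced with a small renderer. Separators and ordering here are my own choices, not mandated by the skill; the field names follow the documented schema:

```python
def render_trust(r: dict) -> str:
    """Render a VoiceTrust result in the compact format described above."""
    lines = [f"Voice trust: {r['trust_label']}"]
    lines.append(
        f"Details: match {r['speaker_match']} - trust {r['overall_trust']} - "
        f"confidence {r['confidence']} - identity {r['identity_score']} - "
        f"quality {r['audio_quality']}"
    )
    if r.get("decision"):
        lines.append(f"Decision: {r['decision']}")
    reasons = r.get("decision_reasons") or []
    # Briefly explain degradation without over-claiming certainty.
    if r["trust_label"] != "high" and reasons:
        lines.append("Why: " + "; ".join(reasons))
    return "\n".join(lines)
```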

Failure handling

  • If STT succeeds and VoiceTrust fails: keep transcript, report trust as unavailable or inconclusive.
  • If VoiceTrust succeeds and STT fails: keep trust result, report transcription failure.
  • If both fail: say the audio could not be processed reliably.
  • If decision != "allow_command", do not execute voice commands.
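
The first three fallback rules above can be sketched as a merge step, using None to signal that a stage failed (the function and its return shape are illustrative, not part of the skill's API):

```python
def merge_results(transcript, trust):
    """Combine STT and VoiceTrust outcomes per the fallback rules above.

    transcript: STT text, or None if transcription failed.
    trust: VoiceTrust result dict, or None if verification failed.
    """
    if transcript is None and trust is None:
        return {"status": "audio could not be processed reliably"}
    if trust is None:
        # Keep the transcript; report trust as unavailable.
        return {"transcript": transcript, "trust": "unavailable"}
    if transcript is None:
        # Keep the trust result; report the transcription failure.
        return {"trust": trust, "note": "transcription failed"}
    return {"transcript": transcript, "trust": trust}
```

Command execution would still be gated separately on decision == "allow_command"; this merge only decides what to report back.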

First-time setup

For first-time setup, local installation, enrollment, or bootstrap, read:

  • references/quickstart.md
