VoiceTrust

v0.4.1

Interpret VoiceTrust results for owner verification on voice/audio inputs. Use when you need the meaning of VoiceTrust fields, trust labels, command-gating d...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for kunyancai/voicetrust.

Prompt Preview: Install & Setup
Install the skill "VoiceTrust" (kunyancai/voicetrust) from ClawHub.
Skill page: https://clawhub.ai/kunyancai/voicetrust
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install voicetrust

ClawHub CLI


npx clawhub@latest install voicetrust

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability

Name and description match the implementation: the code implements speaker verification, audio quality scoring, and decision gating. Requested dependencies (torch/speechbrain) and local model files are appropriate for a local speaker-verification system. No unrelated credentials, binaries, or config paths are requested.

Instruction Scope

SKILL.md limits runtime actions to running STT, running VoiceTrust, and local bootstrap via references/quickstart.md. The included CLI/demo scripts and pipeline operate on local audio files and local owner enrollment state; they do not instruct reading unrelated system files or exfiltrating data. The skill explicitly warns that owner enrollment data should remain local.

Install Mechanism

The bundle is instruction-only (no platform install spec), but includes scripts that download large model checkpoint files at runtime from a GitHub raw mirror (raw.githubusercontent.com/ChuaKhunngan/VoiceTrust/...). This is a reasonable approach for model-heavy tools, and the downloader includes hard-coded SHA-256 checks for each file (good). Still, downloading binary checkpoints from a personal mirror carries supply-chain risk: users should verify the canonical repo, the raw URLs, and the checksums before running downloads. The use of GitHub raw is common, but the mirror is not an official upstream release host (the script documents the SpeechBrain upstream).

Credentials

The skill declares no required environment variables or credentials and operates on local filesystem paths under the skill runtime (assets/, data/). The README mentions resolving FFMPEG_BIN if present, which is a reasonable runtime hint; no unexplained SECRET/TOKEN/PASSWORD requirements are present.

Persistence & Privilege

The always flag is false, and the skill does not request persistent platform-level privileges. It stores local owner enrollment data and voiceprints under runtime/data/owners and runtime/data/voiceprints (expected for this purpose). It does not modify other skills or system-wide agent config.
Assessment

This skill appears to do what it claims: local speaker verification using SpeechBrain models. Before installing or running it:

  1. Inspect and verify the raw model URLs and the hard-coded SHA-256 checksums in scripts/ensure_models.py (the script will fail if checksums don't match). Prefer obtaining model checkpoints from the official upstream (SpeechBrain / HuggingFace) if you want stronger provenance guarantees.
  2. Be aware the runtime will write enrollment/voiceprint files to runtime/data/owners and runtime/data/voiceprints; do not share those files if you want the enrolled identity private.
  3. The runtime requires heavy ML dependencies (torch, speechbrain, etc.) and optionally ffmpeg; expect a large install and CPU/GPU resource use.
  4. If you are concerned about network supply-chain risk, run ensure_models.py only after manually auditing the URLs, or populate the assets directory with checkpoints you obtained from an upstream release.

Otherwise the skill is internally consistent and proportionate for its stated purpose.
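
The checksum discipline described above can be sketched in a few lines. This is not the actual code from scripts/ensure_models.py; the function names and structure are illustrative, showing only the pinned-digest comparison the scan report says the downloader performs:

```python
import hashlib
import hmac

def verify_checksum(data: bytes, expected_hex: str) -> bool:
    """Return True only if data hashes to the pinned SHA-256 digest.

    hmac.compare_digest is used for a constant-time comparison; for a
    local integrity check a plain == would also be acceptable.
    """
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_hex.lower())
```

A downloader built on this would fetch the bytes first, call verify_checksum before writing the file into assets/, and abort on mismatch, which matches the fail-closed behavior described above.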


Latest: vk97d8h6rttkhvka6nh2mdzdjg983jxkp
136 downloads · 1 star · 3 versions
Updated 1mo ago · v0.4.1 · MIT-0

VoiceTrust

VoiceTrust answers one question: is this audio likely spoken by the enrolled owner?

Normal use:

  • run STT for content
  • run VoiceTrust for owner verification
  • merge both before replying

Do not use this skill to define machine-specific commands. Local routing and machine policy belong elsewhere.

Runtime note

This skill bundle is lightweight:

  • source code and setup docs are included
  • large model files are not bundled
  • owner enrollment data is local runtime state and must not be published

If model assets are missing, read references/quickstart.md.

Output fields

VoiceTrust results may include:

  • speaker_match
  • audio_quality
  • overall_trust
  • confidence
  • identity_score
  • trust_label
  • decision
  • decision_reasons
  • speaker_id
  • speech_duration
  • speech_ratio
  • vad_status
  • failure_reason
  • raw_scores.speaker_similarity

How to use the result

Use trust_label for concise rendering. Use decision for command gating. Do not treat audio quality alone as owner identity evidence.

Trust label

  • high: identity_score >= 85 and confidence >= 80 and no failure
  • medium: identity_score >= 72 and confidence >= 68 and no failure
  • low: everything else

Common downgrade signals:

  • vad_status != "ok"
  • speech_duration < 2.5
  • speech_ratio < 0.45
  • speaker_match < 70
  • failure_reason != null
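
A minimal sketch of the labeling and downgrade rules above, assuming the result is available as a plain dict with the documented field names (the helper names are mine, not the skill's):

```python
def trust_label(r: dict) -> str:
    """Map a VoiceTrust result to high/medium/low per the thresholds above."""
    failed = r.get("failure_reason") is not None
    if not failed and r.get("identity_score", 0) >= 85 and r.get("confidence", 0) >= 80:
        return "high"
    if not failed and r.get("identity_score", 0) >= 72 and r.get("confidence", 0) >= 68:
        return "medium"
    return "low"

def downgrade_signals(r: dict) -> list:
    """Collect the common downgrade signals listed above, as strings."""
    signals = []
    if r.get("vad_status") != "ok":
        signals.append("vad_status != ok")
    if r.get("speech_duration", 0.0) < 2.5:
        signals.append("speech_duration < 2.5")
    if r.get("speech_ratio", 0.0) < 0.45:
        signals.append("speech_ratio < 0.45")
    if r.get("speaker_match", 0) < 70:
        signals.append("speaker_match < 70")
    if r.get("failure_reason") is not None:
        signals.append("failure_reason present")
    return signals
```

Missing fields default toward "low" / downgraded, which is the safe direction for a verification gate.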

Command gating

For voice command execution:

  • use the normal path when speech_duration >= 3.0
  • allow a short voice sample only when all of the following are true:
    • speech_duration >= 1.2
    • speaker_match >= 85
    • confidence >= 85
  • in all cases, command execution still requires:
    • speaker_match >= 78
    • confidence >= 80
    • identity_score >= 82
    • vad_status == "ok"
    • failure_reason == null
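
The gating rules above combine a baseline that applies to every sample with a duration-dependent path. A sketch, assuming the same dict-shaped result (the helper is illustrative, not the skill's own gating code):

```python
def allow_command(r: dict) -> bool:
    """Apply the command-gating rules above to a VoiceTrust result dict."""
    # Baseline requirements that hold in all cases.
    baseline = (
        r.get("speaker_match", 0) >= 78
        and r.get("confidence", 0) >= 80
        and r.get("identity_score", 0) >= 82
        and r.get("vad_status") == "ok"
        and r.get("failure_reason") is None
    )
    if not baseline:
        return False
    duration = r.get("speech_duration", 0.0)
    if duration >= 3.0:
        return True  # normal path
    # Short-sample path: stricter match/confidence thresholds.
    return (
        duration >= 1.2
        and r.get("speaker_match", 0) >= 85
        and r.get("confidence", 0) >= 85
    )
```

Note that a sample between 1.2 s and 3.0 s can pass only if it clears both the baseline and the stricter short-sample thresholds.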

Interpretation:

  • decision == "allow_command" means command execution may proceed
  • decision != "allow_command" means do not execute commands from this sample
  • non-command voice content may still be handled normally
  • music / non-speech / non-command audio should not enter the command path

CLI example:

uv run --python .venv/bin/python ../scripts/demo.py \
  --audio /path/to/sample.ogg \
  --speaker owner \
  --json

Human rendering

Preferred compact format:

  • Voice trust: high / medium / low
  • Details: match <x> - trust <y> - confidence <z> - identity <i> - quality <q>
  • if relevant: Decision: allow_command / reject_command

If degraded, say why briefly using decision_reasons. Do not over-claim certainty.
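
The compact format above can be produced with a small renderer. Separators and ordering here are my own choices, not mandated by the skill; the field names follow the documented schema:

```python
def render_trust(r: dict) -> str:
    """Render a VoiceTrust result in the compact format described above."""
    lines = [f"Voice trust: {r['trust_label']}"]
    lines.append(
        f"Details: match {r['speaker_match']} - trust {r['overall_trust']} - "
        f"confidence {r['confidence']} - identity {r['identity_score']} - "
        f"quality {r['audio_quality']}"
    )
    if r.get("decision"):
        lines.append(f"Decision: {r['decision']}")
    reasons = r.get("decision_reasons") or []
    # Briefly explain degradation without over-claiming certainty.
    if r["trust_label"] != "high" and reasons:
        lines.append("Why: " + "; ".join(reasons))
    return "\n".join(lines)
```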

Failure handling

  • If STT succeeds and VoiceTrust fails: keep transcript, report trust as unavailable or inconclusive.
  • If VoiceTrust succeeds and STT fails: keep trust result, report transcription failure.
  • If both fail: say the audio could not be processed reliably.
  • If decision != "allow_command", do not execute voice commands.
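
The first three fallback rules above can be sketched as a merge step, using None to signal that a stage failed (the function and its return shape are illustrative, not part of the skill's API):

```python
def merge_results(transcript, trust):
    """Combine STT and VoiceTrust outcomes per the fallback rules above.

    transcript: STT text, or None if transcription failed.
    trust: VoiceTrust result dict, or None if verification failed.
    """
    if transcript is None and trust is None:
        return {"status": "audio could not be processed reliably"}
    if trust is None:
        # Keep the transcript; report trust as unavailable.
        return {"transcript": transcript, "trust": "unavailable"}
    if transcript is None:
        # Keep the trust result; report the transcription failure.
        return {"trust": trust, "note": "transcription failed"}
    return {"transcript": transcript, "trust": trust}
```

Command execution would still be gated separately on decision == "allow_command"; this merge only decides what to report back.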

First-time setup

For first-time setup, local installation, enrollment, or bootstrap, read:

  • references/quickstart.md
