Security audit

llama.cpp Benchmark

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed local benchmarking helper that can also build llama.cpp, with no evidence of hidden credential use, exfiltration, persistence, or destructive behavior beyond scoped build cleanup.

Install this if you are comfortable running local shell scripts that may search your home directory and /DATA for llama-bench, clone or update llama.cpp from GitHub, compile it with cmake, and write benchmark outputs. Review the build directory before using update or clean rebuild options.
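As a rough illustration of the kind of search described above, the sketch below looks for an existing llama-bench binary without cloning, building, or deleting anything. It is not the skill's own script: the function name `find_llama_bench` and the `max_depth` limit are assumptions introduced here for illustration only.

```python
import os
import shutil
from pathlib import Path


def find_llama_bench(roots, max_depth=4):
    """Return paths of executable files named 'llama-bench'.

    Hypothetical helper: checks PATH first, then walks each root
    read-only, descending at most max_depth directory levels.
    """
    found = []
    on_path = shutil.which("llama-bench")
    if on_path:
        found.append(Path(on_path))
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # e.g. /DATA may not exist on this machine
        base_depth = len(root.parts)
        for dirpath, dirnames, filenames in os.walk(root):
            if len(Path(dirpath).parts) - base_depth >= max_depth:
                dirnames[:] = []  # do not descend any further
            candidate = Path(dirpath) / "llama-bench"
            if "llama-bench" in filenames and os.access(candidate, os.X_OK):
                found.append(candidate)
    return found
```

Calling `find_llama_bench([Path.home(), "/DATA"])` would list candidate binaries so they can be reviewed before any update or clean rebuild option is used. The depth limit is there only to keep a home-directory walk from taking too long.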

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (2)

Description-Behavior Mismatch

Severity: Medium

Confidence: 88%
Finding: Although the content is documentation rather than executable code, it instructs users to run helper scripts that update and build llama.cpp from GitHub, while the skill is framed as a benchmark tool. This hidden expansion of scope can mislead downstream automation or users into approving network retrieval and local compilation that they did not expect.

Context-Inappropriate Capability

Severity: Medium

Confidence: 86%
Finding: The update-from-GitHub capability is not necessary for the core stated purpose of running benchmarks on existing local GGUF models, so it broadens attack surface without clear justification. Any feature that pulls remote code and rebuilds binaries introduces supply-chain and environment-integrity risks, especially if an agent might invoke it automatically under the assumption it is part of normal benchmarking.
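One way to address the automatic-invocation concern is to require explicit opt-in before any step that fetches or builds code. The sketch below is a minimal illustration under that assumption; `plan_steps`, the `allow_update` flag, and the step names are hypothetical and do not come from the skill itself.

```python
def plan_steps(binary_found, allow_update=False):
    """Decide which steps may run for a benchmark request.

    Hypothetical gate: fetching and building happen only when the
    caller explicitly sets allow_update; they are never implied.
    """
    if allow_update:
        # Remote retrieval and compilation were explicitly approved.
        return ["fetch", "build", "benchmark"]
    if binary_found:
        # An existing local llama-bench is enough for benchmarking.
        return ["benchmark"]
    raise PermissionError(
        "no llama-bench binary found and update/build was not approved"
    )
```

With this shape, `plan_steps(True)` yields only the benchmark step, and a missing binary produces an error instead of a silent clone and rebuild.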

VirusTotal

All 64 of 64 vendors reported this skill as clean.


Static analysis

No suspicious patterns detected.