HLE Reasoning Wrapper

Security checks across malware telemetry and agentic risk

Overview

This small HLE prompt-formatting helper is purpose-aligned and has no network, credential, shell, or broad filesystem behavior, though its optional local answer cache should be disclosed more clearly.

Before installing, be aware that using the cacheAnswer helper can write model answers to a local cache.json file in the skill directory. Avoid caching private benchmark content or sensitive answers unless that local persistence is acceptable, and delete the cache file when no longer needed.
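
The cleanup step above can be sketched as follows. This is a minimal Python illustration and not the skill's own code: the function name `purge_cache` and the directory passed to it are hypothetical, and only the `cache.json` file name comes from this report.

```python
from pathlib import Path


def purge_cache(skill_dir: str, filename: str = "cache.json") -> bool:
    """Delete the local answer cache if it exists.

    `skill_dir` is a hypothetical install location supplied by the caller.
    Returns True if a file was removed, False if there was nothing to delete.
    """
    cache_file = Path(skill_dir) / filename
    if cache_file.is_file():
        cache_file.unlink()
        return True
    return False
```

Calling `purge_cache("/path/to/skill")` after a benchmark run would remove any persisted answers; the path shown is a placeholder, not a location documented by the skill.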

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (3)

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The skill’s stated purpose is prompt formatting and output validation, but it also stores question/answer data on disk. That hidden persistence expands the data-handling scope beyond user expectations and can expose benchmark prompts, answers, or sensitive inputs to later disclosure through local file access, reuse, or accidental inclusion in artifacts.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: Persistent local storage is not justified by a simple reasoning-wrapper function, and the code writes benchmark-related content to a local JSON file without any access control or lifecycle management. In this context, cached prompts and answers may contain proprietary evaluation data or sensitive user inputs, making unnecessary retention a confidentiality risk.
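
To illustrate what access control and lifecycle management could look like for such a cache, here is a hedged Python sketch. It is not taken from the skill: the function name `write_cache_entry`, the entry schema (`answer`, `stored_at`), and the one-day default retention are all assumptions made for illustration, and the owner-only file mode is only meaningful on POSIX systems.

```python
import json
import os
import time


def write_cache_entry(path, question, answer, max_age_s=86400.0, now=None):
    """Store one entry with owner-only permissions and expire old entries.

    Hypothetical schema: {question: {"answer": ..., "stored_at": epoch_seconds}}.
    """
    now = time.time() if now is None else now
    entries = {}
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            entries = json.load(fh)
    # Lifecycle management: keep only entries younger than max_age_s.
    entries = {
        q: e for q, e in entries.items() if now - e["stored_at"] <= max_age_s
    }
    entries[question] = {"answer": answer, "stored_at": now}
    # Access control: create the file readable and writable by the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)
    # Tighten permissions if the file already existed with a wider mode.
    os.chmod(path, 0o600)
    return entries
```

Expiring entries on every write keeps the sketch short; a real implementation might instead prune on read or on a schedule.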

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill writes cached data to disk with no user-facing warning, consent, or configuration, so users may unknowingly persist prompts and model outputs. This is especially risky for benchmark or evaluation workflows, where preserving answers can leak sensitive material, contaminate future runs, or violate data-handling expectations.
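
One way to address the missing warning and consent is to make persistence opt-in and announce each write. The Python sketch below is an assumption-laden illustration, not the skill's `cacheAnswer` implementation: `cache_answer`, its `enabled` flag, and the flat question-to-answer JSON layout are hypothetical.

```python
import json
import os
import warnings


def cache_answer(question, answer, path="cache.json", enabled=False):
    """Persist an answer only when the caller explicitly opts in.

    Returns True if the entry was written, False if caching is disabled.
    """
    if not enabled:
        # Default behavior: nothing touches the disk.
        return False
    # User-facing notice so persistence is never silent.
    warnings.warn(
        f"Persisting prompt and answer to {path}; "
        "delete the file when it is no longer needed.",
        UserWarning,
        stacklevel=2,
    )
    entries = {}
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            entries = json.load(fh)
    entries[question] = answer
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)
    return True
```

With the default `enabled=False`, a benchmark run leaves no file behind, which avoids contaminating later runs unless the user has deliberately chosen to cache.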

VirusTotal

66/66 vendors flagged this skill as clean.
