Interactive Skills Platform MVP

Security checks across malware telemetry and agentic risk

Overview

This is a planning skill, not an install-time payload, but its recommended MVP would execute uploaded skills while postponing key safety controls.

Treat this as a design draft that needs security review before use. Do not implement the proposed MVP as written; require sandboxed execution, authentication and authorization, upload consent, secret scanning/redaction, retention and deletion controls, per-run permissions, audit logs, and clear warnings before storing, sending, or executing uploaded skills.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The skill metadata presents the capability as producing architecture and planning artifacts, but the body of the design expands into a platform that uploads and executes untrusted SKILL.md content via an agent capable of CLI/Bash actions. This is dangerous because it disguises materially riskier behavior behind a benign planning-oriented description, increasing the chance that reviewers or users underestimate the exposure to remote code execution and data access.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The design explicitly loads uploaded SKILL.md content into an Agent and executes it, while acknowledging that skills may invoke Bash/CLI. Executing arbitrary uploaded skills without confinement is effectively arbitrary code execution on the backend and can lead to host compromise, secret exfiltration, lateral movement, or abuse of connected APIs.
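One way to picture the per-run permissions recommended in the overview is a deny-by-default gate in front of every tool call. The sketch below is illustrative only: the tool names, the `RunPermissions` type, and the `authorize_tool_call` function are hypothetical and do not come from the reviewed design, and the `sandboxed` flag merely records that isolation exists elsewhere; it does not provide confinement by itself.

```python
from dataclasses import dataclass, field

# Hypothetical tool names; the reviewed design only says skills may invoke Bash/CLI.
HIGH_RISK_TOOLS = frozenset({"bash", "cli"})


@dataclass(frozen=True)
class RunPermissions:
    """Tools that a single run of an uploaded skill is allowed to use."""
    allowed_tools: frozenset = field(default_factory=frozenset)
    # True only if the run is known to execute inside an isolated sandbox.
    sandboxed: bool = False


def authorize_tool_call(perms: RunPermissions, tool: str) -> bool:
    """Deny by default; high-risk tools additionally require a sandboxed run."""
    name = tool.lower()
    if name not in perms.allowed_tools:
        return False
    if name in HIGH_RISK_TOOLS and not perms.sandboxed:
        return False
    return True
```

Under these assumptions, a run created with no explicit grants cannot reach Bash at all, and even an explicit grant is refused unless the run is marked as sandboxed.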

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The document claims to avoid overdesign and explicitly defers sandboxing/security, yet still includes execution of uploaded skills in the MVP. In this context, deferring basic containment while enabling untrusted execution makes the MVP materially unsafe, because the omitted controls are prerequisite safety mechanisms rather than optional enhancements.

Missing User Warnings

High

Confidence: 90%
Finding: The design describes a user flow where uploaded skills can be used and executed, but it does not provide any clear user-facing warning that a skill may trigger arbitrary command execution or unsafe side effects. This is dangerous because users and operators may treat skills as harmless content files rather than executable artifacts, leading to unsafe uploads and execution of malicious skills.
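A minimal sketch of the kind of user-facing warning this finding asks for might look like the following. The marker heuristic, the warning wording, and the function names are all assumptions made for illustration. The baseline warning is always shown, because the absence of shell markers does not prove a skill is harmless.

```python
import re

# Built programmatically so the code contains no literal fence marker.
FENCE = "`" * 3

# Heuristic markers (assumed): a fenced shell block or a mention of a Bash tool.
EXEC_MARKERS = re.compile(
    re.escape(FENCE) + r"(?:bash|sh|shell)\b|\bbash\b", re.IGNORECASE
)

BASE_WARNING = "Uploaded skills are executable artifacts, not plain content files."
SHELL_WARNING = " This skill references shell execution and may run arbitrary commands."


def build_warning(skill_md: str) -> str:
    """Always return a warning; strengthen it when shell markers are present."""
    if EXEC_MARKERS.search(skill_md):
        return BASE_WARNING + SHELL_WARNING
    return BASE_WARNING


def confirm_run(skill_md: str, acknowledged: bool) -> str:
    """Refuse to proceed until the user has acknowledged the warning."""
    if not acknowledged:
        raise PermissionError(build_warning(skill_md))
    return "run-approved"
```

The design choice here is that acknowledgment is required for every skill, and the heuristic only changes how strongly the warning is worded.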

Missing User Warnings

Medium

Confidence: 86%
Finding: The system sends uploaded SKILL.md content to Claude API for analysis, but the design does not disclose this data flow or obtain user consent. This creates privacy and confidentiality risk because uploaded files may contain proprietary prompts, embedded secrets, internal URLs, or operational logic that users did not expect to be transmitted to a third-party model provider.
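The consent and secret scanning/redaction controls named in the overview could be combined into a single step that runs before any content leaves the backend. The sketch below assumes a hypothetical `prepare_for_analysis` helper and a small, illustrative set of regular expressions; a real scanner would need a far broader rule set, and these patterns will miss many secret formats.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive rule set).
SECRET_PATTERNS = [
    # Shape of an AWS access key ID.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # PEM-encoded private key blocks.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    # Generic "name = value" style assignments for common secret names.
    re.compile(r"(?i)\b(?:api[_-]?key|token|password)\b\s*[:=]\s*\S+"),
]


def redact_secrets(text: str) -> tuple[str, int]:
    """Replace matches with a placeholder and return (text, number of redactions)."""
    total = 0
    for pattern in SECRET_PATTERNS:
        text, count = pattern.subn("[REDACTED]", text)
        total += count
    return text, total


def prepare_for_analysis(skill_md: str, user_consented: bool) -> str:
    """Return redacted content, or refuse if the user has not consented."""
    if not user_consented:
        raise PermissionError("user has not consented to third-party analysis")
    redacted, _ = redact_secrets(skill_md)
    return redacted
```

Checking consent first means that content is never prepared for transmission, even in redacted form, unless the user has agreed to the data flow.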

VirusTotal

66/66 vendors flagged this skill as clean.
