skills coach

Security checks across malware telemetry and agentic risk

Overview

This skill is a real skill-optimization tool, but it can run commands, install software, call external AI services, and automatically change or delete generated skill files with limited approval gates.

Review carefully before installing. Use it only in a disposable workspace, VM, or container for third-party skills. Disable auto_install_deps and auto_fix unless you explicitly want package installs and automated file edits, and avoid running it on skills or workspaces containing secrets or content you would not send to external AI services.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (81)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
cmd.append('--auto-install')
        print("  → Auto-install dependencies enabled")

    result = subprocess.run(cmd, cwd=exec_agent_dir)

    if result.returncode != 0:
        print("ERROR: exec-agent failed")
Confidence
82% confidence
Finding
result = subprocess.run(cmd, cwd=exec_agent_dir)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
try:
            # Try to use Claude API for evaluation
            result = subprocess.run(
                ['claude', '--model', 'sonnet-4'],
                input=prompt,
                capture_output=True,
Confidence
90% confidence
Finding
result = subprocess.run( ['claude', '--model', 'sonnet-4'], input=prompt, capture_output=True, text=True, ti

subprocess module call

Medium
Category
Dangerous Code Execution
Content
print(f"  Command: {install_cmd}")

        try:
            result = subprocess.run(
                install_cmd,
                shell=True,
                capture_output=True,
Confidence
98% confidence
Finding
result = subprocess.run( install_cmd, shell=True, capture_output=True, text=True, timeout=300 # 5 minute ti
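
The shell=True pattern flagged above can often be avoided when the command is a plain string with no pipes or redirects. A minimal sketch under that assumption follows; `run_without_shell` is a hypothetical helper and is not part of the scanned skill.

```python
import shlex
import subprocess


def run_without_shell(command: str, timeout: int = 300) -> subprocess.CompletedProcess:
    """Run a command string as an argument list instead of through a shell."""
    # shlex.split tokenizes like a POSIX shell but nothing interprets the
    # result, so ';', '&&' and '|' reach the program as ordinary arguments.
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")
    # shell defaults to False when an argument list is passed.
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
```

With this shape, a string such as `pip install foo; rm -rf ~` would hand `foo;`, `rm`, `-rf` and `~` to pip as arguments rather than starting a second command. It does not make the install itself trustworthy.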

subprocess module call

Medium
Category
Dangerous Code Execution
Content
for cmd, strategy_name in strategies:
            try:
                result = subprocess.run(
                    cmd,
                    capture_output=True,
                    text=True,
Confidence
86% confidence
Finding
result = subprocess.run( cmd, capture_output=True, text=True, timeout=300 # 5 minute timeout

subprocess module call

Medium
Category
Dangerous Code Execution
Content
# Execute command
        start_time = time.time()
        try:
            result = subprocess.run(
                command,
                shell=True,
                capture_output=True,
Confidence
98% confidence
Finding
result = subprocess.run( command, shell=True, capture_output=True, text=True, timeout=300, # 5 minute timeo

subprocess module call

Medium
Category
Dangerous Code Execution
Content
f.write(skill_content)

            try:
                result = subprocess.run(
                    command,
                    shell=True,
                    capture_output=True,
Confidence
98% confidence
Finding
result = subprocess.run( command, shell=True, capture_output=True, text=True, timeou

subprocess module call

Medium
Category
Dangerous Code Execution
Content
return

            print("✓ Running command optimizer...")
            result = subprocess.run(
                [sys.executable, str(optimizer_path), str(self.output_dir), str(self.output_dir.parent)],
                capture_output=True,
                text=True,
Confidence
88% confidence
Finding
result = subprocess.run( [sys.executable, str(optimizer_path), str(self.output_dir), str(self.output_dir.parent)], capture_output=True, text

Lp3

Medium
Category
MCP Least Privilege
Confidence
96% confidence
Finding
The skill declares no permissions despite describing behavior that reads and writes files, executes shell commands, accesses environment data, and uses external networked services. This undermines informed consent and policy enforcement because users and supervising systems cannot accurately assess the operational risk before invocation.

Tp4

High
Category
MCP Tool Poisoning
Confidence
98% confidence
Finding
The documented purpose frames the skill as analysis and reporting, but the described workflow goes much further: executing generated tasks, modifying copied code, changing requirements, installing dependencies, calling external APIs, and potentially downloading resources. That mismatch is dangerous because users may authorize what sounds like passive evaluation while actually granting a self-modifying, networked execution pipeline over arbitrary target skills.

Description-Behavior Mismatch

Medium
Confidence
96% confidence
Finding
The CLI configures `output_dir=str(target_skill_path)`, which causes optimization artifacts or rewritten skill content to be written directly into the target skill directory rather than an isolated workspace. Because this tool is supposed to analyze and optimize another skill, in-place writes can unintentionally modify trusted source files, corrupt the target skill, or persist unsafe changes produced from untrusted task inputs.
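
One way to avoid the in-place writes described above is to point the optimizer at a copy of the target skill. The sketch below assumes the target is an ordinary directory; `make_isolated_copy` and the `skill-coach-` prefix are hypothetical names chosen for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def make_isolated_copy(target_skill_path: str) -> Path:
    """Copy the target skill into a fresh temporary directory and return the copy."""
    source = Path(target_skill_path).resolve()
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    # mkdtemp creates a new directory readable only by the current user.
    workspace = Path(tempfile.mkdtemp(prefix="skill-coach-"))
    copy = workspace / source.name
    shutil.copytree(source, copy)
    return copy
```

The returned path could then be used as the output directory, leaving the original skill untouched until a person decides to merge the changes.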

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The orchestrator's ability to execute multiple subskills and optionally auto-install dependencies exceeds a narrow coaching/reporting role and creates a broad code-execution surface. In this context the tool processes untrusted target skills and generated tasks, so package installation and execution can be leveraged to run attacker-controlled code or persist malicious dependencies.

Intent-Code Divergence

High
Confidence
94% confidence
Finding
The function advertises execution of both original and optimized skills, but only passes the original skill path to the executor. This undermines the integrity of the evaluation pipeline: an attacker or faulty optimization could evade scrutiny because reported optimized results may not correspond to the optimized artifact at all.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The skill instructs the evaluator to delete the optimized skill directory as part of an evaluation decision, even though the subskill's stated purpose is assessment and reporting. This mixes analysis with destructive state-changing behavior, creating a risk of unintended data loss if the retention logic, paths, or inputs are wrong or manipulated.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The evaluator is expected to score skills, but it also invokes an external LLM to judge documentation quality. That adds hidden model-execution capability and sends untrusted skill content outside the local evaluator, which is risky in a security-sensitive analysis pipeline because the skill text may itself contain adversarial instructions.

Description-Behavior Mismatch

Medium
Confidence
96% confidence
Finding
The evaluator does not merely report a retention decision; it deletes the optimized skill directory with `shutil.rmtree`. In an automation context, destructive filesystem actions are dangerous because a bad path, manipulated output directory, or mistaken evaluation can irreversibly remove artifacts beyond what a user may expect from an evaluator.
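
The bad-path risk described above can be reduced by refusing to delete anything that is not strictly inside a known workspace. This is a minimal sketch of such a guard; `remove_inside` is a hypothetical helper, not the skill's own code.

```python
import shutil
from pathlib import Path


def remove_inside(workspace: str, candidate: str) -> None:
    """Delete candidate only if it resolves to a path strictly below workspace."""
    root = Path(workspace).resolve()
    target = Path(candidate).resolve()
    # Resolving first collapses '..' segments and symlinks, so a path such as
    # 'workspace/../other' is compared by where it really points.
    if target == root or root not in target.parents:
        raise ValueError(f"refusing to delete {target}: not inside {root}")
    shutil.rmtree(target)
```

A guard like this limits the blast radius of a wrong path; it does not decide whether deleting the optimized skill is the right outcome of an evaluation.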

Description-Behavior Mismatch

High
Confidence
99% confidence
Finding
The file can modify the host by installing commands and Python packages, which exceeds a checker/reporting role and creates unnecessary capability to alter the execution environment. In a skill whose stated purpose is coaching and optimization, this mismatch makes the behavior more concerning because it grants persistence and package-management powers not clearly required for the task.

Context-Inappropriate Capability

High
Confidence
98% confidence
Finding
Embedding host package-management and system installation logic in this skill introduces powerful side effects unrelated to its declared coaching purpose. This broadens the attack surface substantially, especially because some installers invoke external tools or shell commands that can change the machine state.

Intent-Code Divergence

Medium
Confidence
88% confidence
Finding
The docstring claims auto-installation occurs with user consent, but this file contains no consent prompt at the moment of installation; enabling `auto_install` is enough to proceed. That discrepancy can mislead users and reviewers about the actual safety model, making unexpected host modification more likely.
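
A consent step at the moment of installation, which the finding above says is missing, could look roughly like the following. `confirm_install` and its `ask` parameter are hypothetical; the scanned file's actual function names are not shown in this report.

```python
from typing import Callable


def confirm_install(
    install_cmd: str,
    auto_install: bool,
    ask: Callable[[str], str] = input,
) -> bool:
    """Return True only if auto_install is enabled and the user confirms."""
    if not auto_install:
        return False
    # Show the exact command so the user approves what will really run.
    answer = ask(f"Run '{install_cmd}'? [y/N] ")
    # Anything other than an explicit yes counts as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

Passing `ask` as a parameter keeps the prompt testable and lets a non-interactive caller supply a function that always declines.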

Description-Behavior Mismatch

High
Confidence
97% confidence
Finding
This file gives the skill unrestricted command-execution capability and pairs it with optional dependency installation, which materially exceeds the stated coach/analysis/optimization/reporting purpose. That scope mismatch is dangerous because users may invoke the skill expecting analysis, while the implementation can perform real host actions that alter the system or run attacker-controlled payloads.

Context-Inappropriate Capability

High
Confidence
97% confidence
Finding
Shell-based execution is not justified by the described coaching/optimization role and creates a broad attack surface for command injection and arbitrary system interaction. In this context, the capability is more dangerous because the skill processes task content and directly turns extracted text into executable shell commands.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
Automatic installation of dependencies modifies the runtime environment and can introduce unreviewed code, persistence, or supply-chain risk. For a skill described as analysis/optimization/reporting, silently changing the host environment is unjustified and increases the chance of unintended package installation or execution of malicious install scripts.

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The code silently reads ANTHROPIC_API_KEY from the environment and enables outbound transmission of analyzed content to a remote LLM service. In this file, that means local script contents and failure data may be sent off-host without explicit user consent, which exceeds the stated analysis/reporting role and creates confidentiality and governance risk.
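
The silent key lookup described above could be gated behind an explicit opt-in. In this sketch, `remote_llm_key` and the `allow_remote` flag are hypothetical; only the `ANTHROPIC_API_KEY` variable name comes from the finding.

```python
import os
from typing import Mapping, Optional


def remote_llm_key(
    allow_remote: bool,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the API key only when the caller has opted in to remote calls."""
    if not allow_remote:
        # Without opt-in the key is never read, so no content can be sent.
        return None
    return env.get("ANTHROPIC_API_KEY")
```

A caller would treat a `None` result as "stay local" and skip the remote analysis step entirely.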

Description-Behavior Mismatch

Medium
Confidence
92% confidence
Finding
The auto-fixer directly edits requirements.txt and SKILL.md, changing project behavior and documentation rather than only analyzing or reporting. Automatic mutation of skill files can introduce dependency confusion, inaccurate documentation, or unauthorized changes in environments expecting read-only analysis tools.

Description-Behavior Mismatch

High
Confidence
97% confidence
Finding
This block sends script contents to an LLM, accepts generated replacement code, validates only syntax, and then overwrites Python files in place. That creates a high-risk code-integrity issue because untrusted model output can alter program behavior, introduce security flaws, or embed malicious logic without review.
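
Instead of overwriting files after a syntax check alone, generated code can be turned into a diff for a person to review. The following is a minimal sketch of that alternative, not the skill's behavior; `propose_patch` is a hypothetical name.

```python
import ast
import difflib


def propose_patch(original: str, generated: str, filename: str = "script.py") -> str:
    """Return a unified diff for human review instead of overwriting the file."""
    # ast.parse raises SyntaxError on invalid code. Passing this check only
    # shows the code parses; it says nothing about what the code does.
    ast.parse(generated)
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        generated.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(diff)
```

The resulting text can be saved next to the skill and applied only after review, which keeps the model's output out of the executable path by default.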

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
The optimizer executes arbitrary shell commands extracted from SKILL.md and later executes LLM-generated replacements, both with shell=True. Because SKILL.md content and LLM output are untrusted inputs, this creates direct command-execution risk, enabling file destruction, credential access, network exfiltration, or lateral movement on the host running the optimizer.
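
Commands extracted from SKILL.md or produced by an LLM could be vetted against an allowlist before anything runs. The sketch below assumes a small, hypothetical set of permitted programs; `ALLOWED_PROGRAMS` and `vet_command` are illustrative names only.

```python
import shlex
from typing import List

# Hypothetical allowlist; a real deployment would choose its own entries.
ALLOWED_PROGRAMS = {"pytest", "ruff", "mypy"}


def vet_command(command: str) -> List[str]:
    """Return an argument list if the program is allowlisted, else raise."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise PermissionError(f"command not allowed: {command!r}")
    # The caller should pass this list to subprocess.run without shell=True,
    # so shell metacharacters in later arguments are never interpreted.
    return argv
```

An allowlist narrows what can start, but allowlisted tools may still execute code from the skill being tested, so a sandbox or container remains advisable.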

VirusTotal

66/66 vendors flagged this skill as clean.
