Security audit

TrustLoop Skill Evolver

Security checks across malware telemetry and agentic risk

Overview

This is a transparent workspace-local skill management helper, but autonomous mode can change learned skills without per-change approval.

Install this only if you want a workspace-local system that can create and maintain learned skills. Start in manual mode, review candidate diffs before publishing, avoid letting it capture sensitive project details, and review the optional TrustLoop plugin separately before installing it or enabling autonomous mode.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (7)

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file claims the workspace root is the only writable scope and says to never inspect or modify global skill directories, but later instructs the agent to read and write under `./skills/`. That contradiction weakens safety boundaries and can cause the agent to treat a broader area of the workspace as mutable than the initial guardrails imply, increasing the chance of unintended modification of existing skills.
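
One way to remove this kind of ambiguity is to state the writable scope once and check every path against it. The sketch below is illustrative only and is not taken from the skill: the function name `is_within_scope` and the idea of passing `./skills/` as the single allowed root are assumptions.

```python
from pathlib import Path


def is_within_scope(candidate: str, allowed_root: str) -> bool:
    """Return True only if `candidate` resolves to a path inside `allowed_root`.

    Hypothetical helper: the skill's real path handling is not shown in the audit.
    """
    root = Path(allowed_root).resolve()
    # Joining with an absolute candidate discards `root`, so absolute paths
    # outside the root are rejected by the containment check below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

With a check like this, a relative path such as `my-skill/SKILL.md` is accepted, while `../other/SKILL.md` or an absolute path to a global skill directory is rejected before any read or write happens.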

Vague Triggers

Medium

Confidence: 90%
Finding: Allowing 'close natural-language equivalents' for sensitive lifecycle actions like approve, publish, rollback, and mode changes creates ambiguous command interpretation. An attacker or accidental user phrasing could be misread as authorization for a destructive or security-relevant action, especially because this skill is always-on and manages persistent workspace state.
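
A stricter alternative to accepting close natural-language equivalents is an exact-match allowlist for lifecycle actions. The following is a minimal sketch under stated assumptions: the command phrases, the `LIFECYCLE_COMMANDS` table, and `parse_lifecycle_command` are hypothetical and do not come from the skill itself.

```python
from typing import Optional

# Hypothetical allowlist: only these exact phrases authorize a lifecycle action.
LIFECYCLE_COMMANDS = {
    "approve": "approve",
    "publish": "publish",
    "rollback": "rollback",
    "set mode manual": "mode:manual",
    "set mode assisted": "mode:assisted",
    "set mode autonomous": "mode:autonomous",
}


def parse_lifecycle_command(text: str) -> Optional[str]:
    """Map user input to a lifecycle action, or None if it is not an exact match."""
    # Normalize case and whitespace only; no fuzzy or semantic matching.
    normalized = " ".join(text.strip().lower().split())
    return LIFECYCLE_COMMANDS.get(normalized)
```

Under this scheme, `"  Publish "` maps to `publish`, while a loose phrase such as `"go ahead and ship it"` returns `None` and would need an explicit command before anything is changed.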

Missing User Warnings

Medium

Confidence: 90%
Finding: The publish and rollback procedures authorize writes, backups, and restores of `SKILL.md` files, including auto-promotion in some modes, without a mandatory user-visible confirmation or warning at the point of modification. In a skill that evolves and publishes other skills, silent file mutation is particularly risky because it can alter trusted automation behavior in the workspace and make review harder.
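
The mitigation implied here is a confirmation step at the point of modification. The sketch below shows one possible shape for it, assuming a caller-supplied `confirm` callback and a `.bak` backup naming scheme; both are illustrative choices, not the skill's actual behavior.

```python
import shutil
from pathlib import Path
from typing import Callable


def write_skill_with_confirmation(
    path: Path, new_text: str, confirm: Callable[[str], bool]
) -> bool:
    """Write `new_text` to `path` only after the user confirms a visible warning.

    Hypothetical helper: returns False and leaves the file untouched if declined.
    """
    action = "overwrite" if path.exists() else "create"
    if not confirm(f"About to {action} {path}. Proceed?"):
        return False
    if path.exists():
        # Keep a backup next to the original so the change can be rolled back.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(new_text, encoding="utf-8")
    return True
```

In an interactive session, `confirm` could wrap a prompt such as `lambda msg: input(msg + " [y/N] ").strip().lower() == "y"`, so that no `SKILL.md` write happens silently.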

Vague Triggers

Medium

Confidence: 91%
Finding: The template includes a 'Trigger Signals' section but leaves it entirely as free-form placeholders with no requirement for specificity, constraints, or review criteria. In managed skills, ambiguous activation guidance can cause derived skills to trigger too broadly, increasing the chance that an agent applies the workflow in unintended contexts and performs actions outside the operator's expectations.

Autonomous Decision Making

Medium

Category: Excessive Agency
Content:

| Mode | What it does | Best for |
| --- | --- | --- |
| `manual` | Creates candidates and waits for human approval before publishing | Teams that want maximum control |
| `assisted` | Auto-approves low-risk updates but still keeps publishing manual | Teams that want less busywork without giving up review |
| `autonomous` | Auto-publishes low-risk patches to `main` and low-risk new skills to `canary` | Teams that want fast iteration with strict low-risk boundaries |

Default mode: `manual`.
Confidence: 84%
Finding: Auto-approve
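
To make the flagged behavior concrete, the three modes quoted in this finding can be read as a small decision function. This is a sketch of that reading, not the skill's implementation: the `Decision` type, the `decide` function, and the boolean `low_risk` input are assumptions, and the quoted text does not say how `assisted` treats new skills, so the sketch conservatively leaves them unapproved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    auto_approve: bool
    auto_publish: bool
    channel: Optional[str]  # "main", "canary", or None when nothing is published


def decide(mode: str, low_risk: bool, is_new_skill: bool) -> Decision:
    """Hypothetical policy derived from the quoted mode table."""
    if mode not in ("manual", "assisted", "autonomous"):
        raise ValueError(f"unknown mode: {mode}")
    # Anything not low-risk, and everything in manual mode, waits for a human.
    if not low_risk or mode == "manual":
        return Decision(False, False, None)
    if mode == "assisted":
        # Assumption: only updates are auto-approved; publishing stays manual.
        return Decision(not is_new_skill, False, None)
    # autonomous: low-risk patches go to main, low-risk new skills to canary.
    return Decision(True, True, "canary" if is_new_skill else "main")
```

Seen this way, `autonomous` is the only mode in which a change reaches `main` or `canary` with no human step, which is why the finding is categorized under Excessive Agency.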

Autonomous Decision Making

Medium

Category: Excessive Agency
Content: In `manual`, the draft only becomes a managed skill after approval. In `assisted`, low-risk updates may be auto-approved, but publish still stays manual. In `autonomous`, low-risk patches can publish directly and low-risk new skills can go out as canaries.
Confidence: 87%
Finding: auto-approve

Autonomous Decision Making

Medium

Category: Excessive Agency
Content:

- Treat user suggestions as collaboration, not failure.
- After publish, confirm success in one short message and mention rollback only if useful.
- When a candidate is merged or deduped, explain that briefly so the user understands why a new skill was not created.
- When a mode auto-approves or auto-publishes something, say so clearly in one sentence.

## When To Create A Candidate
Confidence: 87%
Finding: auto-approve

VirusTotal

55/55 vendors flagged this skill as clean.
