Security audit

Multi-Agent Collaboration System Skills

Security checks across malware telemetry and agentic risk

Overview

This skill is mostly a coherent multi-agent documentation workflow, but it needs review because its initializer has unsafe shell-script input handling and its delegation/cron guidance can trigger agent actions without clear safeguards.

Review and preferably patch scripts/init.sh before running it; use a simple alphanumeric project name if you do run it. Run the initializer from the intended project directory, verify where files will be written, keep secrets out of indexed docs, and only enable sub-agent or cron automation after deciding what context can be shared and how edits will be reviewed or rolled back.
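The "simple alphanumeric project name" advice above can be enforced before the name ever reaches a shell script. The following is a minimal sketch, not the skill's own code: the function name `is_safe_project_name` and the 64-character limit are assumptions made for illustration.

```python
import re

# Hypothetical allow-list: ASCII letters and digits only, 1 to 64 characters.
# The length limit is an assumption, not something the skill specifies.
_SAFE_NAME = re.compile(r"[A-Za-z0-9]{1,64}")


def is_safe_project_name(name: str) -> bool:
    """Return True only if the entire name matches the allow-list."""
    # fullmatch rejects any name containing spaces, quotes, slashes,
    # semicolons, "$(...)" and other shell metacharacters.
    return _SAFE_NAME.fullmatch(name) is not None
```

Rejecting anything outside the allow-list is simpler to reason about than trying to escape every shell metacharacter after the fact.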

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (7)

Vague Triggers

Medium
Confidence
90% confidence
Finding
Tier 1 uses broad natural-language keywords such as '整理' ("organize") and '分类' ("classify"), which appear in many ordinary requests, so the router can silently downgrade tasks to a cheaper but weaker model. In a multi-agent workflow, this can cause incorrect handling of tasks that actually require stronger reasoning, increasing the risk of bad edits, missed issues, or unsafe automation decisions.

Vague Triggers

Medium
Confidence
84% confidence
Finding
Tier 2 trigger descriptions include very generic terms such as '分析' ("analyze") and '建议' ("suggest"), which overlap with many unrelated prompts and make routing boundaries unclear. This ambiguity can misclassify sensitive or complex requests, leading to inconsistent model selection and reduced review quality in agent-driven operations.

Vague Triggers

Medium
Confidence
82% confidence
Finding
Tier 3 relies on vague terms such as '复杂' ("complex") and '明确要求' ("explicitly requested"), which are subjective and not operationally defined. As a result, high-risk tasks may fail to escalate to the strongest model, or benign tasks may be over-escalated, undermining both safety and predictability.

Vague Triggers

Medium
Confidence
94% confidence
Finding
The example routing code performs naive substring matching with no exclusions, confidence checks, or conflict resolution, making it easy for ordinary text or adversarial phrasing to force unintended model selection. In an agent skill that may trigger downstream actions, this weak routing logic can directly affect output quality and safety-sensitive decisions.
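One of the gaps named above, missing conflict resolution, can be illustrated with a small sketch. It is not the skill's routing code: the tier names, the `rank` field, and the choice to default to the strongest tier are assumptions; only the keywords are taken from the findings in this report. Matching is still by substring, so this addresses conflicts and unmatched input only, not exclusions or confidence checks.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Tier:
    name: str                  # hypothetical tier label
    keywords: Tuple[str, ...]  # trigger keywords quoted in the findings
    rank: int                  # higher rank = stronger model (assumption)


TIERS = (
    Tier("tier1", ("整理", "分类"), 1),
    Tier("tier2", ("分析", "建议"), 2),
    Tier("tier3", ("复杂", "明确要求"), 3),
)


def route(text: str, default: str = "tier3") -> str:
    """Pick a tier, escalating when several tiers match.

    If keywords from more than one tier appear, the strongest matching
    tier wins. If nothing matches, fall back to `default` (the strongest
    tier here) rather than silently downgrading.
    """
    matched = [t for t in TIERS if any(k in text for k in t.keywords)]
    if not matched:
        return default
    return max(matched, key=lambda t: t.rank).name
```

With this rule, a request that mentions both '整理' and '分析' goes to the stronger of the two tiers instead of whichever keyword happens to be checked first.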

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The document recommends automated archiving and cron-driven agent actions that read and modify files, but it does not warn about unintended edits, scope creep, or the need for review and backups. In a collaboration skill, unattended scheduled writes increase the chance of silent data loss or corruption if prompts, routing, or model behavior are wrong.
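The backup safeguard this finding calls for could look roughly like the sketch below, which copies a file aside before an unattended job overwrites it. The function name `backup_then_write` and the backup naming scheme are hypothetical; the skill itself does not define them.

```python
import shutil
import time
from pathlib import Path


def backup_then_write(path: Path, new_text: str, backup_dir: Path) -> Path:
    """Copy `path` into `backup_dir`, then overwrite `path` with `new_text`.

    Returns the backup location so a bad scheduled edit can be rolled back.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    # A nanosecond timestamp keeps successive backups of one file distinct.
    backup = backup_dir / f"{path.name}.{time.time_ns()}.bak"
    shutil.copy2(path, backup)  # copies contents and file metadata
    path.write_text(new_text, encoding="utf-8")
    return backup
```

A backup of this kind covers rollback only; it does not replace reviewing what a cron-driven agent is allowed to touch.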

Vague Triggers

Medium
Confidence
90% confidence
Finding
The delegation trigger phrases are very broad and natural-language based, which can cause delegation to occur in situations the user did not fully intend. In an agent skill that automatically routes work to another model, ambiguous triggers increase the risk of accidental task execution, unintended disclosure of workspace contents, or prompt-injection-style invocation through copied text.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The guide states that the system will automatically call `sessions_spawn` to invoke a Gemini Flash agent, but it does not warn users that task content, file contents, or repository context may be transmitted to another agent or provider. In a multi-agent collaboration skill, this omission is dangerous because users may unknowingly delegate sensitive data or trigger actions in a different execution context without informed consent.
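The informed-consent step this finding describes can be sketched as a gate in front of the delegation call. The signature of `sessions_spawn` is not given in this report, so the example takes a generic `spawn` callable as a stand-in; `delegate_with_consent` and `confirm` are likewise hypothetical names.

```python
from typing import Callable, Optional, Sequence


def delegate_with_consent(
    task: str,
    files: Sequence[str],
    confirm: Callable[[str], bool],
    spawn: Callable[[str, Sequence[str]], str],
) -> Optional[str]:
    """Show what would be shared, and delegate only if the user agrees.

    `spawn` stands in for the real delegation call; its actual
    signature is an assumption made for this sketch.
    """
    summary = (
        "The following will be sent to another agent or provider:\n"
        f"  task: {task}\n"
        f"  files: {', '.join(files) or '(none)'}"
    )
    if not confirm(summary):
        return None  # nothing leaves the current context
    return spawn(task, list(files))
```

In an interactive session, `confirm` could be as simple as a function that prints the summary and reads a yes/no answer; in unattended runs it would need to be a pre-approved policy instead.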

VirusTotal

65/65 vendors flagged this skill as clean.