AGENTIC AI GOLD STANDARD

Security checks across malware telemetry and agentic risk

Overview

This skill shows no signs of malware, but it makes broad operational claims about always-on, self-improving agents and security controls that the included implementation does not back up.

Review carefully before installing. Treat the self-improvement, security-gate, 24/7 operation, and integration-test claims as unverified marketing unless the publisher supplies real implementation and tests. If you experiment, use an isolated environment, pin dependencies, avoid sensitive data, use least-privilege API keys, and do not grant production tool access or unattended write privileges without explicit approval, logs, and rollback controls.
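One part of the advice above, pinning dependencies, can be checked mechanically before installing. The following is a minimal sketch, assuming the skill ships a pip-style requirements file; the function name `unpinned_requirements` is hypothetical, and the parser is deliberately simplistic (it treats any exact `==` specifier as pinned and does not handle URLs, wildcards, or environment markers).

```python
import re

# A requirement counts as pinned here only if it uses an exact "==" specifier.
PINNED = re.compile(r"^[A-Za-z0-9._\-\[\],]+\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return requirement lines that lack an exact '==' version pin.

    Hypothetical helper for illustration; not part of the scanned skill.
    """
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = "requests==2.31.0\nnumpy>=1.20\n# comment\nflask\n-r other.txt\n"
print(unpinned_requirements(sample))
```

Anything the helper reports is a candidate for pinning before the skill is run, even in an isolated environment.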

SkillSpector

By NVIDIA

Vulnerability Patterns

Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (9)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91%
Finding: The skill metadata advertises advanced self-improving and security-hardened behavior, but the analyzed behavior instead includes dependency installation, filesystem modification, and shell-based setup actions that are not transparently disclosed. This mismatch is dangerous because users may grant elevated trust or permissions based on the claimed security properties while the skill performs broader setup side effects and only simulates the advertised protections.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The example claims specialists terminate after task completion, but every created agent is retained in the class-level `_registry` indefinitely. In an agent framework, this mismatch can lead to memory growth, stale sensitive task metadata being retained longer than expected, and incorrect operator assumptions about lifecycle and cleanup.
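The retention pattern this finding describes can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual code: only the class-level `_registry` name comes from the finding, while the `Specialist` class, its `run` and `release` methods, and the `run_and_release` helper are hypothetical.

```python
class Specialist:
    """Hypothetical agent class; only `_registry` is named in the finding."""

    # Class-level dict shared by all instances. Nothing clears it on its own,
    # so every agent stored here stays reachable for the life of the process.
    _registry = {}

    def __init__(self, name, task):
        self.name = name
        self.task = task
        Specialist._registry[name] = self  # strong reference retains the agent

    def run(self):
        return f"{self.name} finished {self.task}"

    def release(self):
        # Explicit cleanup, so "terminates after task completion" actually holds.
        Specialist._registry.pop(self.name, None)


def run_and_release(name, task):
    """Run a specialist and remove it from the registry even if run() fails."""
    agent = Specialist(name, task)
    try:
        return agent.run()
    finally:
        agent.release()


Specialist("leaky", "task-1").run()            # never released
result = run_and_release("tidy", "task-2")     # released after completion
print(sorted(Specialist._registry))
```

In this sketch the agent that is never released remains in `_registry` along with its task metadata, which is the lifecycle gap the finding points at.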

Intent-Code Divergence

Low

Confidence: 86%
Finding: The code advertises '17 dharmic security checks' but only executes five named checks, creating a false sense of security around agent execution safeguards. In security-sensitive agent infrastructure, overstated controls can cause users to trust protections that do not actually exist, weakening operational decision-making and risk assessment.
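A reviewer can make this kind of gap explicit by comparing the advertised number with the checks that are actually registered. The sketch below assumes the checks are available as a name-to-callable mapping; the helper `audit_check_count` and the placeholder check names are hypothetical, and only the figures 17 and five come from the finding.

```python
def audit_check_count(advertised, checks):
    """Compare an advertised check count with the checks actually registered.

    Hypothetical helper for illustration; not part of the scanned skill.
    """
    implemented = len(checks)
    return {
        "advertised": advertised,
        "implemented": implemented,
        "missing": max(advertised - implemented, 0),
        "consistent": advertised == implemented,
    }


# Placeholders standing in for the five named checks; the real names and
# logic are not given in the finding.
checks = {f"check_{i}": (lambda task: True) for i in range(1, 6)}

report = audit_check_count(17, checks)
print(report)
```

A guard like this could run in a test suite so that documentation and implementation cannot drift apart silently.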

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The docstring and printed messaging describe real self-improvement capabilities, but the implementation only simulates research, testing, and update discovery with random values and console output. This is a deceptive capability claim that can mislead users, auditors, or integrators into trusting a nonexistent autonomous update pipeline.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The method documentation states that enabling auto-improvement causes the skill to research frontier AI, test integrations, validate changes, and propose updates, but the method only toggles a flag and prints status text. This mismatch creates false assurances about security and governance controls, which is risky in a security-sensitive agent framework context.

Intent-Code Divergence

Low

Confidence: 94%
Finding: The script tells users the night cycle runs automatically every night, but it is only invoked directly from main(). This can mislead operators about scheduling, persistence, and expected autonomous behavior, causing misplaced trust in unattended functionality that does not exist.

Natural-Language Policy Violations

Medium

Confidence: 96%
Finding: Presenting autonomous self-improvement as an operational fact without qualification is unsafe because users may believe the system can independently research and evolve safely under claimed controls. In agent tooling, exaggerated autonomy claims are more dangerous than in ordinary demo code because they can influence deployment decisions, trust boundaries, and oversight practices.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill explicitly promotes nightly scanning, testing, proposing updates, and 'auto-evolution' of itself without clear operator approval boundaries, review gates, or warnings about what may change. In a framework that claims persistent execution and self-modification, this creates real risk of unauthorized code, configuration, prompt, dependency, or behavior drift that could affect connected tools and data over time.
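For contrast, an operator approval boundary of the kind this finding says is missing might look like the sketch below. Everything in it is hypothetical (`ProposedUpdate`, `ApprovalGate`, and the log format are illustrative names, not anything the skill provides); it only shows the general idea of refusing to apply a self-proposed change without a named approver and of logging both outcomes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProposedUpdate:
    """Hypothetical record of a change an agent wants to make to itself."""
    description: str
    approved_by: Optional[str] = None  # None means no operator has signed off


class ApprovalGate:
    """Applies an update only if an operator approved it; logs every decision."""

    def __init__(self):
        self.log = []

    def apply(self, update: ProposedUpdate, apply_fn: Callable[[], str]) -> str:
        if update.approved_by is None:
            self.log.append(("blocked", update.description, None))
            raise PermissionError(f"update not approved: {update.description}")
        self.log.append(("applied", update.description, update.approved_by))
        return apply_fn()


gate = ApprovalGate()
try:
    gate.apply(ProposedUpdate("bump dependency"), lambda: "done")
except PermissionError as exc:
    print(exc)

print(gate.apply(ProposedUpdate("bump dependency", approved_by="operator"), lambda: "done"))
```

A real gate would also need rollback and tamper-resistant logging; this sketch covers only the approval check itself.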

Missing User Warnings

Medium

Confidence: 93%
Finding: The quick-start and examples encourage immediate activation of a persistent council and 24/7 operation with no prominent warning about continuous resource usage, background actions, data processing, or downstream effects on connected services. Because the surrounding marketing emphasizes always-on agents, fallback execution, memory, and self-improvement, users may unknowingly launch long-lived autonomous processes in sensitive environments.

VirusTotal

65/65 vendors flagged this skill as clean.
