Security audit

Goal Achiever: Self-Evolving Agent System

Security checks across malware telemetry and agentic risk

Overview

This skill is a broad autonomous publishing system. It can create and run scripts, use logged-in browser sessions, modify external-platform content, schedule background jobs, and rewrite its own prompts, all without adequate user control.

Review before installing. This skill should only be used in a controlled workspace with explicit permission for the target platform/account, clear approval before any post, edit, scheduled job, or prompt-registry change, and logging rules that exclude secrets, cookies, tokens, private prompts, and account data.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (16)

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The skill’s stated purpose is publishing content to a target platform, but it also introduces autonomous prompt generation, prompt registry mutation, and self-evolution of its own operating logic. That materially expands capabilities from task execution into self-modification and persistent behavioral drift, which increases the risk of unsafe automation, policy bypass, and hard-to-audit changes over time.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The manifest presents a content publishing workflow, but the implementation also maintains reusable script registries and performs long-term knowledge ingestion beyond the immediate task. This creates hidden persistence and capability accumulation, allowing later runs to reuse or mutate artifacts that the user may not realize were stored or promoted.

Intent-Code Divergence

Low

Confidence: 84%
Finding: The document claims to be 'only a process document, no script', yet it explicitly instructs script development, static checking, execution, and registry updates. That mismatch can mislead reviewers and users about operational risk, causing them to approve a skill that in practice executes code and changes local state.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The reference expands the skill from a task-execution/publishing workflow into a scheduled autonomous agent that loads tasks, gathers external data, generates reports, and proactively messages a user. That materially broadens capability and trust boundaries, creating an undocumented automation and outbound-communication path that could be abused for unauthorized monitoring, spam, or stealthy task execution.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The design explicitly authorizes subprocess-based use of external tooling for message sending and network collection, granting command execution and outbound communication capabilities beyond the stated business purpose. Even if arguments are passed safely, these capabilities increase the blast radius of prompt abuse or future unsafe parameterization and can enable data exfiltration or unauthorized actions.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The reference explicitly says the design should '规避反自动化检测' (avoid anti-automation detection), which goes beyond ordinary publishing reliability and enters evasion guidance. In the context of a skill for automating posting to external platforms, this materially increases the risk of stealthy policy-violating automation, account abuse, and deceptive operation on third-party services.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The prompt file is fundamentally misaligned with the declared skill purpose: instead of supporting end-to-end goal-platform task execution, it instructs the agent to act as a Claude-only prompt optimizer. In an agent system, this kind of semantic mismatch can cause the skill to perform the wrong high-privilege action path, leading to task hijacking, failed controls, or unsafe routing of user input into an unintended workflow.

Intent-Code Divergence

High

Confidence: 95%
Finding: These instructions explicitly say the agent's primary job is not to generate workflows or orchestration plans, which directly conflicts with the skill's declared end-to-end workflow design. In practice, this can suppress required execution logic, causing the agent to ignore expected safeguards, skip required steps, or produce outputs incompatible with downstream systems.

Intent-Code Divergence

High

Confidence: 95%
Finding: The prompt construction rules forbid workflow-style outputs unless explicitly requested, but the surrounding skill is described as a complete task-execution process. This contradiction creates a reliable failure mode where the agent refuses or omits orchestrated steps that the platform expects, increasing the risk of broken automation and unintended behavior under privileged task contexts.

Vague Triggers

Medium

Confidence: 81%
Finding: The trigger phrases are broad and loosely scoped, making accidental invocation more likely for unrelated user requests. Because this skill can write files, execute scripts, and continue automatically across batches, unintended activation has a larger blast radius than a simple informational skill.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill description mandates automatic writes to task files, logs, registries, and related artifacts without clearly warning the user that persistent system modifications will occur. Hidden state changes reduce informed consent and make recovery, auditing, and containment harder if the workflow behaves unexpectedly.

Missing User Warnings

Medium

Confidence: 93%
Finding: The workflow directs execution of shell commands and script compilation checks, but the skill description does not warn users that code and shell commands may be run. In context, this is dangerous because the same document also enables new script development and automatic progression, which can lead to unexpected command execution under an apparently benign publishing skill.

Missing User Warnings

Medium

Confidence: 96%
Finding: The document instructs silent API-based draft injection and optional post-publication overwrites of external-platform content without any user-visible confirmation, audit notice, or consent checkpoint. In a skill whose purpose is to publish content to third-party platforms, this enables covert modification of user/account content and can be abused for unauthorized posting, tampering, or deceptive edits under an existing authenticated session.
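The consent checkpoint this finding says is missing could look roughly like the sketch below. It is an illustration only, not the skill's code: `PendingAction`, `run_with_approval`, and the `approve` and `execute` callbacks are hypothetical names, and the audit log is modeled as a plain list for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class PendingAction:
    """Hypothetical description of an external write awaiting approval."""
    kind: str      # e.g. "publish", "overwrite", "inject_draft"
    target: str    # platform resource the action would touch
    summary: str   # human-readable preview shown to the user


def run_with_approval(
    action: PendingAction,
    execute: Callable[[PendingAction], None],
    approve: Callable[[PendingAction], bool],
    audit_log: List[str],
) -> bool:
    """Run `execute` only if `approve` returns True; record every outcome."""
    audit_log.append(f"requested: {action.kind} -> {action.target}")
    if not approve(action):
        # Default-deny: nothing touches the external platform without consent.
        audit_log.append(f"denied: {action.kind} -> {action.target}")
        return False
    execute(action)
    audit_log.append(f"executed: {action.kind} -> {action.target}")
    return True
```

In an interactive setting, `approve` would present `action.summary` to the user and wait for an explicit yes. The point of the sketch is that the default path is denial and that both outcomes leave an audit entry.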

Vague Triggers

Medium

Confidence: 91%
Finding: The stop condition is keyed off the generic phrase “停止” ("stop"), which is a common conversational word and can appear in messages that are not meant to halt the workflow. In an event-driven auto-run system, this creates an unsafe control channel where normal user text can prematurely terminate execution, causing denial of service or unintended workflow interruption.
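One way to narrow such a control channel is to treat a message as a stop command only when the whole message is a known command, rather than when the keyword appears anywhere in it. The sketch below assumes that approach; `STOP_COMMANDS` and `is_stop_command` are hypothetical names, and the `/stop` alias is an illustrative addition, not something the skill defines.

```python
# Hypothetical set of accepted stop commands. "/stop" is an assumed alias.
STOP_COMMANDS = {"停止", "/stop"}


def is_stop_command(message: str) -> bool:
    """Return True only when the entire message is a stop command.

    Substring matches are deliberately rejected, so ordinary sentences
    that merely mention the word do not halt the workflow.
    """
    return message.strip() in STOP_COMMANDS
```

A stricter design might drop the bare word entirely and accept only the explicit command form, or ask for confirmation before halting a long-running batch.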

Missing User Warnings

Medium

Confidence: 95%
Finding: The logging specification explicitly encourages recording full workflow inputs, task queries, paths, summaries, and exceptions, but it does not require redaction, minimization, or exclusion of sensitive data. In a content-publishing automation skill, these fields can easily contain user prompts, credentials, private business data, or platform-specific tokens, so the omission creates a realistic risk of privacy leakage and secondary compromise through log exposure.
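The redaction requirement this finding describes as missing can be sketched with Python's standard `logging` module. The patterns below are assumptions chosen for illustration and would need tuning to the secret formats the target platform actually uses; `SECRET_PATTERNS`, `redact`, and `RedactingFilter` are hypothetical names.

```python
import logging
import re

# Assumed patterns: header-style secrets and key=value style secrets.
SECRET_PATTERNS = [
    # Whole-line redaction for headers such as "Cookie: ..." or "Authorization: ...".
    re.compile(r"(?i)\b(authorization|cookie|set-cookie)\s*[:=]\s*[^\n]+"),
    # Key/value pairs whose key contains a sensitive keyword, e.g. access_token=...
    re.compile(r"(?i)([\w-]*(?:api[_-]?key|token|password|secret)[\w-]*)\s*[:=]\s*\S+"),
]


def redact(text: str) -> str:
    """Replace the value part of anything matching SECRET_PATTERNS."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)
    return text


class RedactingFilter(logging.Filter):
    """Logging filter that redacts the fully formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # the message is already formatted
        return True
```

Attaching the filter to a handler (`handler.addFilter(RedactingFilter())`) applies it to every record that handler emits. Pattern-based redaction is a backstop, not a guarantee: minimizing what is logged in the first place remains the stronger control.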

Ssd 3

Medium

Confidence: 86%
Finding: The skill instructs the system to retain and propagate all user feedback into future runs at highest priority, creating a persistent natural-language data channel across batches. That can preserve sensitive information, operational instructions, or secrets longer than necessary and increase the chance of unintended reuse or disclosure in later task contexts.

VirusTotal

60/60 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.