Hermes Evolution

Security checks across malware telemetry and agentic risk

Overview

The skill mostly matches its automation purpose, but it stores user and workflow data, can automatically create new skills and rules, and can send task data to Feishu without enough safeguards.

Review before installing. Use it only in a controlled workspace, inspect or disable Feishu notification defaults before setting FEISHU_APP_TOKEN or FEISHU_APP_SECRET, and do not enable profiling, auto-skill generation, periodic checks, cron jobs, or self-improving rules until you are comfortable with the local files they create and the data they may retain.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (30)

Context-Inappropriate Capability

Medium

Confidence: 82%
Finding: The README explicitly advertises an automatic skill creation capability for an AI assistant enhancement layer, which implies the system may be able to modify or extend its own behavior. In an agent-skill context, self-extension features materially increase attack surface and can enable persistence, unsafe code generation, or unauthorized capability expansion if not tightly sandboxed and explicitly consented to by the user.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: `patchRule` overwrites the rule file with only the changed fields, then `getRuleFull` treats that lightweight file as a base rule and re-applies all historical patches. This can produce incorrect or duplicated state, breaking integrity of stored rules and any downstream logic that relies on a faithful reconstruction of full rule data.
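
One way to avoid the reconstruction problem described above is to keep the base rule immutable and store patches in a separate append-only log. The sketch below is illustrative only: it is written in Python rather than the skill's own language, and the file names `base.json` and `patches.jsonl`, the helper `save_base`, and the snake_case function names are assumptions, not the skill's actual layout.

```python
import json
import tempfile
from pathlib import Path


def save_base(rule_dir: Path, rule: dict) -> None:
    """Write the full rule once; later patches never overwrite this file."""
    (rule_dir / "base.json").write_text(json.dumps(rule), encoding="utf-8")


def patch_rule(rule_dir: Path, changes: dict) -> None:
    """Append only the changed fields to a patch log, leaving the base intact."""
    with (rule_dir / "patches.jsonl").open("a", encoding="utf-8") as log:
        log.write(json.dumps(changes) + "\n")


def get_rule_full(rule_dir: Path) -> dict:
    """Rebuild the full rule: start from the base, apply each patch once, in order."""
    rule = json.loads((rule_dir / "base.json").read_text(encoding="utf-8"))
    patch_file = rule_dir / "patches.jsonl"
    if patch_file.exists():
        for line in patch_file.read_text(encoding="utf-8").splitlines():
            if line.strip():
                rule.update(json.loads(line))
    return rule
```

Because the base file is never rewritten by a patch, rebuilding the rule always starts from complete data and each patch is applied exactly once.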

Intent-Code Divergence

Medium

Confidence: 95%
Finding: This test script is not isolated: it initializes schedules, rewrites on-disk scheduler JSON state using a hard-coded workspace path, and invokes notifier-wrapped execution paths. In a real agent or developer environment, running the test can modify persistent scheduler state and trigger side effects such as task execution or outbound notifications, so the 'test script' label is misleading and accidental operational impact becomes more likely.
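
A common way to isolate such a test is to point it at a throwaway workspace and a stand-in notifier. The following is a minimal Python sketch under stated assumptions: `RecordingNotifier`, `isolated_workspace`, `write_schedule`, and the `scheduler.json` file name are hypothetical and do not come from the skill.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path


class RecordingNotifier:
    """Stand-in notifier that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    def notify(self, message: str) -> None:
        self.sent.append(message)


@contextmanager
def isolated_workspace():
    """Yield a throwaway workspace directory that is deleted on exit."""
    with tempfile.TemporaryDirectory(prefix="scheduler-test-") as tmp:
        yield Path(tmp)


def write_schedule(workspace: Path, schedule: dict, notifier: RecordingNotifier) -> Path:
    """Persist scheduler state inside the given workspace only."""
    state_file = workspace / "scheduler.json"
    state_file.write_text(json.dumps(schedule), encoding="utf-8")
    notifier.notify(f"schedule written: {state_file.name}")
    return state_file
```

With this shape, the test never touches a real workspace path and no message leaves the process.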

Missing User Warnings

Medium

Confidence: 88%
Finding: The README promotes proactive self-checking and automatic skill creation as features but provides no warning about autonomous behavior, data access, or system-side effects. In a skill ecosystem, undocumented autonomous actions can mislead users and operators about what the skill may do in the background, reducing informed consent and increasing the chance of unsafe deployment.

Missing User Warnings

Medium

Confidence: 91%
Finding: The configuration example starts an hourly background nudge engine with no warning that it will continue operating autonomously after initialization. This is dangerous because users may unknowingly enable persistent background behavior that consumes resources, inspects state periodically, or triggers actions outside an expected request-response model.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill explicitly advertises user profiling and need prediction ('Honcho user profiling + need prediction') but provides no user-facing disclosure, consent model, data minimization statement, or retention limits. In an agent skill, this can enable covert collection and inference of sensitive behavioral data, making the capability more dangerous because it is framed as a default product feature rather than an opt-in function.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill describes automatic skill generation ('AutoSkillGenerator automatically creates Skills') without any warning, approval workflow, sandboxing, or trust boundary discussion. System-modifying behavior that can create new executable/configurable skills is dangerous because it can expand privileges, introduce unreviewed code or prompts, and create persistence mechanisms that are hard to audit.

Missing User Warnings

Medium

Confidence: 91%
Finding: The listed Feishu notifier capability implies outbound messaging/data transmission but the documentation gives no warning about what data may be sent externally, under what triggers, or with what authorization controls. In the context of a multi-module agent platform with profiling, summaries, and task data, this increases the risk of silent exfiltration of sensitive operational or personal information to third-party services.

Vague Triggers

Medium

Confidence: 95%
Finding: The rule uses very generic trigger keywords such as '简短' (brief), '简洁' (concise), and '重点' (key points), which are common in ordinary user requests and can activate the rule unintentionally across many contexts. This creates prompt-behavior hijacking risk by broadly forcing output compression, which may suppress safety caveats, nuance, or required detail in unrelated tasks.

Natural-Language Policy Violations

Medium

Confidence: 89%
Finding: The skill is authored entirely around Chinese-language triggers and responses without any user-choice mechanism, which can impose unintended language behavior on downstream interactions. While not directly dangerous on its own, it can degrade reliability, confuse users, and cause misapplication of formatting rules in multilingual environments.

Missing User Warnings

Medium

Confidence: 89%
Finding: The document describes an AutoSkillGenerator that records tool calls and observes workflows to automatically create new skills, but it provides no user notice, approval gate, or change-control guidance. In an agent skill context, automatic generation and persistence of new capabilities can expand the system’s attack surface, capture sensitive operational data, and introduce unauthorized behavior without operator awareness.

Missing User Warnings

High

Confidence: 96%
Finding: The HonchoProfiler section explicitly supports user profiling, identity updates, interaction logging, goal tracking, and prediction of user needs, yet the document includes no privacy notice, consent model, retention limits, or access controls. In this context, the feature can collect sensitive personal and behavioral data and enable non-transparent profiling, creating substantial privacy and compliance risk if deployed as described.

Missing User Warnings

Medium

Confidence: 84%
Finding: The deployment and rollback instructions use forceful file operations such as Copy-Item with -Force and Remove-Item -Recurse -Force against the skill directory, but the document does not prominently warn that these commands overwrite or delete existing files. In operational use, this can cause accidental destruction of deployed skills, configuration loss, or replacement of trusted code with backup contents without sufficient operator awareness.

Missing User Warnings

Medium

Confidence: 88%
Finding: The quick-reference instructs operators to copy backup contents into the live skill directory using PowerShell Copy-Item with the -Force flag, which can silently overwrite existing JavaScript files and subdirectories. In a skill environment, this creates a real integrity risk: a user following the docs may replace reviewed code with backup contents without diffing, validation, or rollback safeguards, enabling accidental deployment of stale or unsafe code.
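
The missing diffing step could look roughly like the preview below, which lists the live files a forced copy would replace with different content. This is a Python sketch for illustration, not part of the skill or its documentation; the function name `overwrite_preview` and the directory arguments are assumptions.

```python
import filecmp
from pathlib import Path


def overwrite_preview(backup_dir: Path, live_dir: Path) -> list[str]:
    """List relative paths whose live copy differs from the backup copy.

    These are the files a forced copy from backup_dir into live_dir
    would silently replace with different content.
    """
    changed = []
    for src in sorted(backup_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(backup_dir)
        dst = live_dir / rel
        # shallow=False compares file contents, not just size and mtime.
        if dst.is_file() and not filecmp.cmp(src, dst, shallow=False):
            changed.append(rel.as_posix())
    return changed
```

An operator could review this list, or refuse to proceed when it is non-empty, before running any forced copy.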

Missing User Warnings

Medium

Confidence: 94%
Finding: The code persistently logs raw tool call arguments and results to a local JSONL file, which can capture secrets, personal data, prompts, file contents, tokens, or other sensitive execution context without any minimization, consent, or access controls. In an agent system, tool inputs and outputs frequently contain high-value data, so creating a durable plaintext audit trail materially increases exposure if the host is shared, compromised, backed up insecurely, or later inspected by unintended parties.
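
Minimization before logging can be sketched as a redaction pass over tool arguments and results. The Python example below is an assumption-laden illustration: the key list, the bearer-token pattern, the 200-character limit, and the names `redact` and `log_line` are all hypothetical, and a real deployment would need a broader policy.

```python
import json
import re

# Hypothetical list of field names treated as secrets.
SENSITIVE_KEYS = {"token", "secret", "password", "authorization", "api_key"}
# Matches "Bearer <credential>" fragments embedded in free text.
TOKEN_PATTERN = re.compile(r"bearer\s+[a-z0-9._\-]+", re.IGNORECASE)


def redact(value, max_len: int = 200):
    """Recursively mask sensitive keys, strip bearer tokens, and truncate long strings."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else redact(v, max_len)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v, max_len) for v in value]
    if isinstance(value, str):
        cleaned = TOKEN_PATTERN.sub("[REDACTED]", value)
        if len(cleaned) > max_len:
            return cleaned[:max_len] + "...[truncated]"
        return cleaned
    return value


def log_line(tool: str, args, result) -> str:
    """Build one JSONL record with redacted arguments and result."""
    record = {"tool": tool, "args": redact(args), "result": redact(result)}
    return json.dumps(record, ensure_ascii=False)
```

Redaction of this kind reduces, but does not eliminate, the exposure the finding describes; retention limits and file permissions are separate controls.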

Missing User Warnings

Medium

Confidence: 79%
Finding: The generator writes auto-created skill files to disk automatically under a computed path, which can create persistent artifacts containing data-derived content from prior tool calls without any approval gate or overwrite safeguards. While this is not arbitrary path traversal in the shown code, silent file creation can leak workflow details, surprise operators, and cause integrity issues if generated files are later trusted or loaded automatically by other components.

Vague Triggers

Medium

Confidence: 87%
Finding: The collaboration is described only as 'same content simultaneously distributed to multiple platforms' without any trigger constraints, approval gates, audience scoping, or platform limits. In an agent system, such broad activation can cause unintended mass posting if invoked in the wrong context or with unsafe content, increasing the risk of accidental external actions.

Missing User Warnings

Medium

Confidence: 91%
Finding: This skill performs multi-platform publishing to external services (Xiaohongshu, Zhihu, and WeChat Official Accounts) but provides no warning about the consequences of public posting, persistence, or cross-platform amplification. That omission makes accidental misuse more dangerous because a single invocation can create irreversible public distribution across several channels.

Missing User Warnings

Medium

Confidence: 94%
Finding: The module persists detailed user profiling data to local disk, including identity, goals, preferences, behavior patterns, and relationship scores, without any visible consent, notice, retention controls, or access protections in this code. In a profiling component, silent storage of behavioral and personal data creates privacy and compliance risk because the data can be collected over time and later exposed, misused, or processed beyond user expectations.

Missing User Warnings

Medium

Confidence: 88%
Finding: The module sends task titles, descriptions, assignees, results, and identifiers to an external Feishu chat, and it defaults to a hard-coded group ID. In a task-management context, this can expose internal or sensitive operational data to third-party infrastructure or an unintended recipient group without explicit consent, classification checks, or minimization.
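
A safer default would require explicit opt-in, an explicitly configured recipient, and a reduced payload. The Python sketch below only builds the payload and does not call any Feishu API; the function name `build_notification`, the `enabled` flag, and the allowed field list are hypothetical.

```python
from typing import Optional

# Hypothetical minimal set of task fields considered safe to send.
ALLOWED_FIELDS = ("id", "title", "status")


def build_notification(task: dict, chat_id: Optional[str], enabled: bool = False) -> Optional[dict]:
    """Return a minimized payload, or None when notifications are not enabled.

    No default group is used: the caller must supply chat_id explicitly.
    """
    if not enabled:
        return None
    if not chat_id:
        raise ValueError("chat_id must be configured explicitly; no default group is used")
    minimized = {k: task[k] for k in ALLOWED_FIELDS if k in task}
    return {"chat_id": chat_id, "task": minimized}
```

Descriptions, assignees, and results are dropped here by design; whether any of them should be sent is a decision for the operator, not a default.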

Missing User Warnings

Medium

Confidence: 92%
Finding: The function persists raw correction records including originalText, correction, reason, and context directly to local JSON files without minimization, redaction, consent, or retention controls. In a self-improving agent, these fields can easily contain sensitive user prompts, proprietary data, credentials, or personal information, creating a privacy and data-exposure risk if the host is compromised, logs are backed up, or files are later reused unintentionally.

Missing User Warnings

Medium

Confidence: 94%
Finding: Automatically generated rules are written to disk with sourceCorrections and examples derived from prior user content, effectively creating a secondary persistence channel for historical prompts and corrections. This amplifies exposure because even if original correction logs are later cleaned up, the sensitive text may remain embedded in generated rule files and version history.

Missing User Warnings

Medium

Confidence: 89%
Finding: The module persistently writes boss/user-provided correction text, reasons, and context to disk without any consent flow, minimization, redaction, or retention controls. If those corrections contain secrets, personal data, credentials, internal instructions, or incident details, they will be stored in plaintext and may later be exposed through filesystem access, backups, logs, or debugging workflows.

Missing User Warnings

Medium

Confidence: 90%
Finding: The code automatically converts repeated natural-language corrections into enabled rule files that are later loaded and reused, without any approval gate or warning. This creates an unsafe self-modifying behavior where sensitive or adversarial text can become persistent behavioral policy, amplifying accidental leakage or instruction poisoning over time.
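
The approval gate the finding calls for can be sketched as a rule that starts inactive and only becomes loadable after a named reviewer signs off. This is a Python illustration under assumptions: `CandidateRule`, `propose_rule`, `approve`, and `active_rules` are hypothetical names, not the skill's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateRule:
    rule_id: str
    text: str
    source_count: int
    approved_by: Optional[str] = None  # None means "not yet reviewed"


def propose_rule(rule_id: str, text: str, source_count: int) -> CandidateRule:
    """Create a rule from repeated corrections; it starts unapproved and inactive."""
    return CandidateRule(rule_id=rule_id, text=text, source_count=source_count)


def approve(rule: CandidateRule, reviewer: str) -> CandidateRule:
    """Record an explicit human approval before the rule may be loaded."""
    if not reviewer.strip():
        raise ValueError("a named reviewer is required")
    rule.approved_by = reviewer
    return rule


def active_rules(rules: list[CandidateRule]) -> list[CandidateRule]:
    """Return only rules that a reviewer has approved."""
    return [r for r in rules if r.approved_by is not None]
```

Under this scheme adversarial or sensitive correction text can still be proposed, but it cannot become persistent behavior without a reviewer seeing it first.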

Missing User Warnings

Medium

Confidence: 90%
Finding: The test script performs unconditional file deletion in a real user directory using a hard-coded path, with no confirmation prompt, dry-run mode, or guard to ensure it is operating only on isolated test artifacts. In this context the deletion is limited by a filename substring filter, but it can still remove unintended files if naming overlaps or the directory contains valuable content, making it a genuine safety issue rather than a purely cosmetic test behavior.
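
A guarded cleanup might match an exact filename prefix instead of a substring and default to a dry run. The Python sketch below is illustrative; the prefix `test-artifact-` and the function name `cleanup_test_artifacts` are assumptions.

```python
from pathlib import Path


def cleanup_test_artifacts(
    directory: Path, prefix: str = "test-artifact-", dry_run: bool = True
) -> list[Path]:
    """Return files whose names start with prefix; delete them only if dry_run is False.

    A prefix match is stricter than a substring match, so a file such as
    'my-test-artifact-notes.txt' is left alone.
    """
    targets = sorted(
        p for p in directory.iterdir() if p.is_file() and p.name.startswith(prefix)
    )
    if not dry_run:
        for p in targets:
            p.unlink()
    return targets
```

Calling it once with the default `dry_run=True` shows what would be removed before anything is deleted.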

VirusTotal

65/65 vendors flagged this skill as clean.