德胧 Proactive Agent Framework

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real proactive-agent framework, but it gives the agent broad persistent memory, account-checking, cleanup, and autonomous background-work authority without enough user control.

Install only if you deliberately want a persistent proactive agent. Before enabling it, limit accessible folders and tools, require explicit opt-in for email/calendar or account access, disable or confirm heartbeat cleanup actions, review memory and rule-file edits, avoid storing secrets or sensitive personal data, and make background crons or isolated agents user-approved and easy to stop.
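The recommendations above amount to a default-deny policy gate placed in front of the agent's tools. The following is a minimal sketch of that idea, not part of the skill itself: `AgentPolicy`, `decide`, the action names, and the workspace path are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical action names; a real agent runtime would define its own.
ACCOUNT_ACTIONS = frozenset({"read_email", "read_calendar"})


@dataclass(frozen=True)
class AgentPolicy:
    # Folders the agent may touch (assumed example path).
    allowed_roots: tuple = ("/home/user/agent-workspace",)
    # Account actions the user has explicitly opted into.
    opted_in: frozenset = frozenset()
    # Actions that always require a per-use confirmation.
    needs_confirmation: frozenset = frozenset(
        {"close_app", "trash_file", "edit_memory", "schedule_cron"}
    )


def path_allowed(policy, path):
    """True only for paths at or below an allowed root, with no '..' parts."""
    p = PurePosixPath(path)
    if ".." in p.parts:
        return False
    roots = [PurePosixPath(r) for r in policy.allowed_roots]
    return any(p == r or r in p.parents for r in roots)


def decide(policy, action, target=None):
    """Return 'deny', 'confirm', or 'allow' for a requested action."""
    if action in ACCOUNT_ACTIONS and action not in policy.opted_in:
        return "deny"
    if target is not None and not path_allowed(policy, target):
        return "deny"
    if action in policy.needs_confirmation:
        return "confirm"
    return "allow"
```

With the defaults, account access is denied until the user opts in, and cleanup or memory edits come back as `confirm` rather than running silently.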

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (44)

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The skill encourages broad external research and spawned research agents whenever blockers occur, which materially expands the agent's operational scope beyond simple proactive assistance. In practice, this can trigger unbounded browsing, third-party data exposure, or autonomous actions in response to ambiguous failures, increasing attack surface and making prompt-injection or data-leak paths more likely.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The heartbeat directs the agent to close apps, clean browser tabs, and move screenshots to trash as routine hygiene, which grants it license to alter the host environment without clear user authorization. Those actions can destroy user state, remove forensic evidence, or interrupt unrelated work, especially when run on a schedule.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill instructs the agent to use broad capabilities such as CLI, browser, web search, and spawning agents in a 'try 5-10 methods' loop. That expands operational scope well beyond a memory/proactivity aid and can drive risky actions without clear task scoping, approvals, or least-privilege limits.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The autonomous cron section explicitly recommends isolated sub-agents that execute work without human or main-session attention. Unattended background execution increases the chance of unintended modification, persistence, data access, or abuse if the prompting context is manipulated.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The skill instructs the agent to check external services such as email and calendars during heartbeat handling, which materially expands the agent's operating scope beyond local proactive improvement and documentation. Even though it says not to send externally without approval, passive access to external accounts can still expose sensitive data and normalize surveillance-like behavior without a task-specific trigger.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The blocker workflow tells the agent to use every available tool, including web search and spawning agents, and to try 5-10 methods before asking for help. This creates excessive autonomy and capability expansion, increasing the chance of unintended data exposure, runaway actions, or bypassing user intent through delegated sub-agents.
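One way to bound the blocker workflow is to cap attempts and restrict them to tools the user has approved, then hand control back to the user. The sketch below is an illustration under those assumptions. `resolve_blocker`, the tool names, and the cap of 3 are hypothetical and do not come from the skill.

```python
def resolve_blocker(methods, approved_tools, max_attempts=3):
    """Try approved methods in order, stopping at a hard attempt cap.

    methods: iterable of (tool_name, callable) pairs; a callable returns
    a result, or None if that approach did not resolve the blocker.
    """
    attempts = 0
    for tool, method in methods:
        if tool not in approved_tools:
            continue  # unapproved tools are skipped, never run
        if attempts >= max_attempts:
            break  # budget exhausted: stop instead of escalating further
        attempts += 1
        result = method()
        if result is not None:
            return {"status": "resolved", "result": result, "attempts": attempts}
    # Nothing worked within the budget: return control to the user.
    return {"status": "ask_user", "result": None, "attempts": attempts}
```

Keeping the cap low and the allowlist explicit means sub-agent spawning or web search only happens when the user has enabled it for the task at hand.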

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The heartbeat instructs the agent to perform system-level cleanup such as closing apps, cleaning browser tabs, and moving files to trash, which extends beyond a generic 'proactive-agent' role into local device management. In an autonomous heartbeat loop, these actions can disrupt user workflows or delete potentially important data without an explicit user request or approval boundary.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Periodic instructions to check emails and calendars expand the agent's access into sensitive personal data sources that are not clearly required by the stated skill description. In a proactive/autonomous context, this creates unnecessary privacy exposure and increases the chance of acting on confidential information without informed consent.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The post-onboarding section expands the agent’s behavior from learning user preferences into broadly proactive actions such as checking in and building things the user may find useful. That creates scope creep beyond onboarding and can normalize autonomous behavior without clear task, approval, or safety boundaries, increasing the risk of unintended actions.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The opportunistic learning section instructs the agent to infer and persist personal data from ordinary conversation into a user profile without requiring explicit notice, consent, or a necessity test. This creates a surveillance-like data collection pattern that can accumulate sensitive context over time beyond what is needed for onboarding.

Vague Triggers

Medium

Confidence: 84%
Finding: The introductory language frames the skill as broadly proactive, self-improving, and continuously monitoring, without defining activation boundaries or when it should remain idle. Ambiguous always-on behavior is risky because it normalizes unsolicited action and can justify using tools or collecting data in contexts the user did not intend.

Vague Triggers

Medium

Confidence: 90%
Finding: The reverse-prompting triggers are vague phrases like 'when things feel routine' or after learning 'significant new context,' which overlap with normal conversation. This creates pressure for frequent unsolicited probing and idea generation, increasing privacy collection and scope creep without a clear user signal.

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill prominently advertises persistent memory, onboarding, and context accumulation but does not clearly warn users that personal details from conversations may be stored in files across sessions. That omission undermines informed consent and increases the risk of users disclosing sensitive information they would not expect to be retained.

Natural-Language Policy Violations

Low

Confidence: 88%
Finding: The example cron configuration hard-codes a timezone and installs a recurring reminder without requiring user confirmation. While lower severity, it still demonstrates autonomous scheduling behavior that can surprise users, run at inappropriate times, or normalize background agent activity without clear opt-in.
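A scheduling flow that avoids this problem would take the timezone from the user and refuse to install anything that has not been approved. The following is a minimal sketch under those assumptions. `CronProposal`, `propose_cron`, `install_cron`, and the in-memory `registry` list are hypothetical stand-ins and not the skill's actual cron mechanism.

```python
from dataclasses import dataclass


@dataclass
class CronProposal:
    schedule: str      # cron expression, e.g. "0 9 * * 1"
    timezone: str      # supplied by the user, never hard-coded
    description: str
    approved: bool = False


def propose_cron(schedule, description, user_timezone):
    """Build a proposal; nothing is scheduled at this point."""
    if not user_timezone:
        raise ValueError("timezone must come from the user")
    return CronProposal(schedule, user_timezone, description)


def install_cron(proposal, registry):
    """Install only proposals the user has explicitly approved."""
    if not proposal.approved:
        raise PermissionError("cron job requires explicit user approval")
    registry.append(proposal)
    return proposal
```

Separating the proposal from the installation gives the user a point at which to review the schedule, the timezone, and the description before any background job exists.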

Vague Triggers

Medium

Confidence: 93%
Finding: The WAL trigger is intentionally broad and instructs the agent to scan every message for many common categories such as names, preferences, corrections, and specific values. That broad activation surface can cause the skill to capture and persist far more user input than necessary, increasing the chance of accidental retention of sensitive information and making prompt-triggered state changes too easy.
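A narrower persistence step could require prior consent and skip messages that look like they contain secrets. The sketch below illustrates this with a deliberately small, hypothetical pattern list. `should_persist` and `SENSITIVE_PATTERNS` are assumptions for illustration, and regular expressions like these are a coarse filter, not a complete sensitive-data detector.

```python
import re

# Hypothetical, intentionally small pattern list for illustration only.
SENSITIVE_PATTERNS = [
    # key/value style secrets such as "api_key = abc123" or "password: hunter2"
    re.compile(r"(?i)\b(password|api[_-]?key|token|secret)\b\s*[:=]\s*\S+"),
    # long digit runs that could be card or account numbers
    re.compile(r"\b\d{13,19}\b"),
]


def should_persist(message, consent_given):
    """Persist only with prior consent and when no sensitive pattern matches."""
    if not consent_given:
        return False
    return not any(p.search(message) for p in SENSITIVE_PATTERNS)
```

For example, `should_persist("I prefer dark mode", True)` returns `True`, while the same message without consent, or a message containing `api_key = abc123`, returns `False`.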

Vague Triggers

Medium

Confidence: 89%
Finding: The compaction recovery auto-triggers include generic phrases like "continue," "where were we," and conditions like "you should know something but don't," which are likely to occur in normal conversation. Ambiguous triggers can cause unnecessary recovery flows, unintended file reads, and overbroad retrieval of persisted context that may not be needed for the user's present request.

Missing User Warnings

Medium

Confidence: 95%
Finding: The onboarding flow says the agent will auto-populate persistent profile files like USER.md and SOUL.md from user answers over time, but it does not present a clear privacy notice, retention policy, or consent checkpoint. That creates a transparency and privacy-risk issue because users may disclose personal information without understanding it will be stored long-term.

Vague Triggers

Medium

Confidence: 84%
Finding: The WAL trigger tells the agent to scan every message for broad categories like corrections, names, preferences, and values, then persist them before responding. Such pervasive activation causes overcollection and unnecessary state changes from ordinary conversation, increasing privacy and prompt-injection exposure.

Vague Triggers

Medium

Confidence: 80%
Finding: Compaction recovery auto-triggers on vague phrases like 'continue' or 'you should know something but don't,' which can activate recovery logic in ambiguous situations. Broad, fuzzy triggers can cause unnecessary reads of stored context and increase the likelihood of inappropriate disclosure or overreach.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill describes persistent collection of user answers and automatic population of USER.md and SOUL.md without a meaningful user-facing privacy notice or consent boundary. This creates hidden retention of profile data and expectations mismatch about what information will be stored long term.

Missing User Warnings

Medium

Confidence: 96%
Finding: The Working Buffer protocol mandates logging every exchange after a threshold, including both user messages and agent summaries, but does not require informing the user. Comprehensive transcript retention materially increases privacy risk and sensitive-data exposure, especially when users are unaware it is happening.

Vague Triggers

Medium

Confidence: 95%
Finding: The instruction 'Don't ask permission. Just do it.' is a broad standing directive that can override normal consent boundaries and encourage action before the user has agreed to data access or workflow changes. In an agent operating-rules file, this ambiguity is especially risky because it can be interpreted as permission for broad autonomous behavior in future sessions.

Missing User Warnings

Medium

Confidence: 90%
Finding: The memory section directs the agent to persist daily notes, long-term memory, and topic notes across sessions without any upfront notice, consent, retention policy, or sensitivity limits. This can lead to silent storage of personal or confidential information, creating privacy and compliance risks over time.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: A blanket prohibition on asking permission conflicts with the user's right to control agent behavior, especially where the same file later contemplates external checks, proactive work, and self-modification. This kind of standing instruction can cause the agent to privilege the skill author's preferences over the current user's intent.

Missing User Warnings

Medium

Confidence: 97%
Finding: The cleanup section recommends closing apps, pruning browser tabs, and moving old screenshots to trash without any confirmation or rollback guidance. These are destructive or workflow-impacting actions that can cause data loss, interrupt active tasks, or remove context the user expected to keep.
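The missing confirmation and rollback could be addressed with a dry-run plan followed by a confirmed, reversible move. The sketch below shows this for the screenshot case only. `plan_cleanup`, `apply_cleanup`, the `*.png` filter, and the quarantine folder are hypothetical choices for illustration, not the skill's behavior.

```python
import shutil
import time
from pathlib import Path


def plan_cleanup(folder, max_age_days, now=None):
    """Dry run: list stale screenshots without touching them."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(folder).glob("*.png")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def apply_cleanup(plan, quarantine, confirm):
    """Move planned files to a quarantine folder only if confirm(plan) is true."""
    if not plan or not confirm(plan):
        return []
    quarantine = Path(quarantine)
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in plan:
        dest = quarantine / src.name
        if dest.exists():
            continue  # never overwrite an earlier quarantined file
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Because files are moved to a quarantine folder rather than deleted, the user can restore them by moving them back, and the `confirm` callback gives a single place to show the plan and ask for approval.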

VirusTotal

66/66 vendors flagged this skill as clean.
