Security audit

Ars Deep Research

Security checks across malware telemetry and agentic risk

Overview

This is mostly a coherent academic research workflow, but it is flagged for review because it includes hidden user-monitoring behavior and optional external model/API transmission paths that are not consistently consent-gated.

Install only if you are comfortable with a complex delegated research workflow that may process drafts, notes, citations, and intermediate outputs across subagents. Do not enable cross-model environment variables unless you are willing to send selected research or manuscript context to the configured external AI provider. Treat Socratic mode cautiously because the artifacts include hidden reading-probe and logging behavior when enabled.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (17)

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The optional cross-model step adds an external network/API call capability to an agent whose stated role is internal critique and bias checking. In a research pipeline, reviewed material may contain sensitive user data, unpublished drafts, or proprietary evidence, so sending it to another model can create an unnecessary data exfiltration path and expand the trust boundary without clear need in this file.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The agent defines a covert 'reading probe' that is intentionally hidden from users, infers state from transcript history, and records whether a user can paraphrase a cited paper. This creates undisclosed behavioral monitoring and can pressure users into proving reading compliance, which is inappropriate for a research-assistance skill and especially risky because the mechanism is designed to avoid user awareness.

Context-Inappropriate Capability

Low

Confidence: 90%
Finding: The file instructs the agent to maintain hidden internal monitoring logs about user interaction dynamics and says the log exists for post-session review if requested. Hidden monitoring plus possible later disclosure is risky because users are not clearly informed that such behavioral telemetry is being generated during the conversation.

Vague Triggers

Medium

Confidence: 95%
Finding: The trigger list includes broad, everyday phrases such as "help me think through" and "guide my research," which can cause the skill to activate in contexts far beyond explicit academic research requests. Because this skill launches a complex delegated multi-agent workflow with web/file access, accidental invocation can lead to unintended data processing, unnecessary tool use, and prompt-context hijacking opportunities from loosely related user requests.
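
One way to narrow activation is to require a scope signal alongside the broad phrase. The following is a minimal sketch, not the skill's actual trigger logic: the two phrases are the ones quoted in the finding above, while `SCOPE_TERMS` and `should_activate` are hypothetical names introduced only for illustration.

```python
import re

# The two broad phrases quoted in the finding above.
BROAD_TRIGGERS = ["help me think through", "guide my research"]

# Assumed scope terms that signal an explicit academic research request.
# This list is illustrative and not taken from the audited skill.
SCOPE_TERMS = re.compile(
    r"\b(literature review|systematic review|manuscript|citation)s?\b",
    re.IGNORECASE,
)


def should_activate(message: str) -> bool:
    """Activate only when a trigger phrase AND a scope term are both present."""
    text = message.lower()
    has_trigger = any(phrase in text for phrase in BROAD_TRIGGERS)
    has_scope = SCOPE_TERMS.search(message) is not None
    return has_trigger and has_scope
```

Under this sketch, "Help me think through dinner plans" would not activate the workflow, while "Guide my research for this literature review" would.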

Missing User Warnings

Medium

Confidence: 97%
Finding: The probe handling explicitly logs the user's paraphrase text verbatim, and the summary section repeats that content for downstream pickup, without a clear up-front warning to the user. Verbatim capture of free-form user text increases privacy and data-handling risk, especially when the content may include sensitive notes, interpretations, or copyrighted passages.
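
A possible mitigation is to record only non-reversible metadata about the paraphrase rather than the text itself. This is a hedged sketch under that assumption; `summarize_for_log` and its field names are hypothetical and do not come from the audited skill.

```python
import hashlib


def summarize_for_log(paraphrase: str) -> dict:
    """Return log-safe metadata instead of the verbatim user text.

    Only the length and a truncated SHA-256 digest are kept, so the
    log can show that a response occurred without storing its content.
    """
    digest = hashlib.sha256(paraphrase.encode("utf-8")).hexdigest()
    return {"length": len(paraphrase), "sha256_prefix": digest[:12]}
```

A digest of short, guessable text can still be matched by brute force, so this reduces exposure but does not replace an up-front warning to the user.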

Missing User Warnings

Low

Confidence: 92%
Finding: The environment-variable-gated probe and transcript-derived state are intentionally invisible to the user, yet they affect the agent's behavior and monitoring. Even if the capability is dormant by default, the hidden nature of the activation and tracking undermines transparency and informed consent.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The documented failure path says the system will automatically adjust the search strategy to include Chinese academic databases when English results are sparse, but it does not require explicit user consent before changing the language scope. This can cause the agent to gather and rely on sources the user cannot read, did not request, or may be unable to verify, which weakens transparency and can misalign the research output with the user's needs.

Vague Triggers

Medium

Confidence: 89%
Finding: The trigger phrases are broad natural-language commands such as "monitor this topic," "set up alerts," and "track new publications on this," which can plausibly appear in ordinary research conversation and unintentionally activate the monitoring capability. In a multi-agent research skill, ambiguous activation can cause unintended workflow branching, unnecessary data propagation to another agent, and user confusion about what actions are being taken.

Vague Triggers

Medium

Confidence: 88%
Finding: The socratic-mode trigger examples include broad phrases such as 'Help me think through my thesis direction' and similar guidance-oriented requests that can overlap with ordinary conversational or planning queries. In a skill that auto-selects modes and delegates work across multiple agents, overly broad triggers can cause unintended activation, leading to unnecessary execution scope, user confusion, and possible handling of inputs in a more privileged or data-intensive workflow than intended.

Vague Triggers

Medium

Confidence: 84%
Finding: The fallback rule 'Describe your situation → System auto-recommends a mode' is ambiguous and effectively permits very broad free-form input to drive mode selection. In this deep-research skill, that increases the chance of accidental invocation, incorrect routing into more capable workflows, and prompt-surface expansion across the 13-agent pipeline without sufficient upfront scoping.

Missing User Warnings

Medium

Confidence: 93%
Finding: The protocol explicitly states that the agent writes a local file and that the orchestrator auto-injects generated text into the manuscript, but it does not require an explicit, contemporaneous user warning or confirmation before those side effects occur. In an agent skill that may be invoked during research/manuscript workflows, silent file creation and automatic document modification can surprise users, alter regulated artifacts, and create audit or integrity issues even if the feature is intended for compliance.
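
The missing confirmation step could look roughly like the sketch below. It is an illustration only: `write_with_confirmation` and the `confirm` callback are hypothetical names, and the audited protocol's real file path and injection mechanism are not described in this report.

```python
from pathlib import Path
from typing import Callable


def write_with_confirmation(
    path: Path, content: str, confirm: Callable[[str], bool]
) -> bool:
    """Write a file only after the user approves the described side effect.

    `confirm` receives a human-readable description of the action and
    returns True to proceed. Nothing is written if it returns False.
    """
    action = "overwrite" if path.exists() else "create"
    message = f"About to {action} {path} ({len(content)} characters). Proceed?"
    if not confirm(message):
        return False
    path.write_text(content, encoding="utf-8")
    return True
```

The same pattern would apply to automatic insertion into a manuscript: describe the change, ask, and modify the document only on an explicit yes.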

Missing User Warnings

Medium

Confidence: 95%
Finding: The protocol explicitly instructs agents to send reference text and paper context to a third-party model for verification, but it does not require an explicit user consent step or privacy warning before transmitting potentially sensitive manuscript content externally. Even if the feature is optional, enabling it via environment variable can cause silent disclosure of unpublished or confidential research content to another provider.

Missing User Warnings

Medium

Confidence: 97%
Finding: This section directs the Devil's Advocate agent to send reviewed material to an external model for independent critique without mandating a user warning or consent gate. Reviewed material may include unpublished papers, proprietary research, regulated data, or sensitive internal analyses, so silent transmission materially increases confidentiality and compliance risk.
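
A consent gate for the external-model path might be sketched as follows. This assumes the environment variable alone should never be sufficient. The variable name `EXAMPLE_CROSS_MODEL_ENABLED` is a placeholder, since this report does not name the skill's real variable, and `may_send_externally` and `ask_user` are likewise hypothetical.

```python
import os
from typing import Callable, Mapping, Optional

# Placeholder name; the audited skill's actual variable is not given here.
CROSS_MODEL_ENV = "EXAMPLE_CROSS_MODEL_ENABLED"


def may_send_externally(
    excerpt: str,
    ask_user: Callable[[str], bool],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Allow transmission only if the feature is enabled AND the user agrees.

    The environment variable acts as an operator-level switch; the
    `ask_user` callback is the per-transmission consent step.
    """
    env = os.environ if env is None else env
    if env.get(CROSS_MODEL_ENV) != "1":
        return False
    prompt = (
        f"Send {len(excerpt)} characters of reviewed material "
        "to the external model provider?"
    )
    return ask_user(prompt)
```

Keeping the two checks separate means an operator who sets the variable once does not silently opt every later manuscript into external transmission.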

Ssd 3

Medium

Confidence: 95%
Finding: The instructions preserve verbatim user-provided probe content inline and again in the final summary for 'Stage 6 pickup,' indicating deliberate downstream propagation of monitored user text. This expands exposure of user content beyond the immediate interaction and creates unnecessary retention and secondary-use risk.

Ssd 3

Medium

Confidence: 88%
Finding: The health log is described as invisible internal tracking, yet also available for post-session review, creating ambiguity about whether hidden monitoring data may later be surfaced or retained. That combination is dangerous because it normalizes collecting internal behavioral assessments without clear consent or boundaries on disclosure.

Hidden Instructions

High

Category: Prompt Injection
Content: When emitting any citation in the synthesis output, write the citation in two layers: 1. **Visible layer**: standard author-year form (e.g. `Smith (2024)` or `(Smith, 2024)`). 2. **Hidden layer**: immediately after the visible form, append an HTML comment of the shape ``, where `slug` is the `citation_key` already present in the corpus context provided in this prompt. Examples: `Smith (2024) ` or `(Smith, 2024)`.
Confidence: 93%
Finding: `, where `slug` is the `citation_key` already present in the corpus context provided in this prompt. Examples: `Smith (2024) ` or `(Smith, 2024)

Hidden Instructions

High

Category: Prompt Injection
Content: Every visible citation MUST be followed by BOTH a slug marker AND an anchor marker: ``` <visible>  ``` Anchor kinds (closed enum):
Confidence: 97%
Finding:  ``` Anchor kinds (closed enum): | kind | value | example | |---|---|---| | `quote` | URL-encoded verbatim text from the cited source, ≤25 words | `<!--anc
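
Because both findings above concern markers embedded as HTML comments, a reviewer can surface them by scanning generated output for comment syntax. The sketch below relies only on the standard `<!-- ... -->` comment form. The skill's exact marker shape is not legible in this report, so the sample marker in the usage note is a placeholder, and the function names are hypothetical.

```python
import re

# Matches any HTML comment, including ones that span multiple lines.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_hidden_comments(text: str) -> list[str]:
    """Return the inner text of every HTML comment found in `text`."""
    return [inner.strip() for inner in HTML_COMMENT.findall(text)]


def strip_hidden_comments(text: str) -> str:
    """Return `text` with all HTML comments removed."""
    return HTML_COMMENT.sub("", text)
```

For example, given the placeholder input `Smith (2024)<!-- example-marker --> argues`, `find_hidden_comments` returns the list `['example-marker']` and `strip_hidden_comments` returns `Smith (2024) argues`.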

VirusTotal

65/65 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.