DeepThinking Framework

Security checks across malware telemetry and agentic risk

Overview

This skill appears to run locally and shows no sign of exfiltrating data, but it stores and reuses sensitive personal reflections and behavioral profiles while giving the user too little control over them.

Install only if you intentionally want a local coaching skill that remembers personal reflections across sessions. Before using it for sensitive topics, review ~/.deepthinking storage, avoid sharing secrets or regulated information, disable any cron/systemd setup unless needed, and be prepared to manually inspect or remove stored memory and profile files. VirusTotal telemetry was pending and was not used as the basis for this verdict.
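As an aid to the manual inspection recommended above, the following is a minimal sketch that lists every file under a storage directory with its size. Only the ~/.deepthinking path comes from this report; the function name list_stored_files is hypothetical, and nothing is assumed about the file names or layout the skill actually uses.

```python
from pathlib import Path


def list_stored_files(root):
    """Return (relative_path, size_in_bytes) for each regular file under root.

    Hypothetical helper: it only walks the directory and makes no
    assumption about what the skill stores there.
    """
    root = Path(root).expanduser()
    if not root.is_dir():
        # Nothing stored yet, or the skill was never run.
        return []
    entries = []
    for path in root.rglob("*"):
        if path.is_file():
            entries.append((path.relative_to(root).as_posix(), path.stat().st_size))
    # Sort by relative path so the listing is stable between runs.
    return sorted(entries)
```

Calling list_stored_files("~/.deepthinking") and printing the result gives a quick inventory to review before deciding what to keep or delete.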

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (28)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 96%
Finding: The skill is presented as a guided thinking framework, but it also performs persistent behavioral profiling, long-term memory storage, archival, and self-improvement workflows that materially expand its data handling and autonomy. That mismatch can mislead users and operators about what data is collected and how it will be reused, undermining informed consent and risk assessment.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The documented nightly evolution and self-improvement workflow adds autonomous behavior beyond what a user would expect from a guided conversational framework. Even if changes require review, the analysis and proposal generation still process accumulated user data and increase the operational risk surface.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The documented nightly evolution and self-improvement workflow adds autonomous behavior beyond what a user would expect from a guided conversational framework. Even if changes require review, the analysis and proposal generation still process accumulated user data and increase the operational risk surface.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The skill both tells the agent to disclose retained behavioral notes and separately instructs it to silently use prior memories without announcing that use. This inconsistency creates a transparency failure where users cannot reliably know when prior sensitive disclosures are influencing the conversation.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The file explicitly frames the modules as 'psychological interventions' and then supplies directive, behavior-shaping prompts rather than neutral brainstorming support. In a stateful coaching skill, this can push users into emotionally manipulative interactions without consent, boundaries, or suitability checks, increasing the risk of distress or inappropriate pseudo-therapeutic guidance.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: These prompts use aggressive confrontation and coercive phrasing to pressure compliance, including shaming language and violent hypothetical framing. That exceeds ordinary problem-solving assistance and can produce emotional harm, especially for vulnerable users who may interpret the agent as authoritative or therapeutic.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The script goes beyond prompt-evolution and creates a persistent semantic user profile from historical memory data, storing inferred heuristics for future use. Persisting behavioral inferences increases privacy risk, enables profiling across sessions, and expands the system's data-retention scope beyond what the stated feature requires.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The code explicitly distills repeated user behavior into persistent heuristics such as 'stable truths about the user,' which is a profiling function rather than simple operational telemetry. Those inferences may be sensitive, inaccurate, or repurposed later, creating privacy and misuse risks disproportionate to a thinking-framework skill.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The TUI stores highly sensitive introspective user content into persistent state and memory during the session, but the skill description frames itself as a thinking framework rather than a data-collecting system. This creates a transparency and privacy problem because users may reveal personal fears, motivations, and life context without meaningful notice that the information is being retained.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code automatically searches prior memory based on the user's topic and explicitly comments that the results are absorbed silently rather than disclosed. Hidden retrieval of prior personal context is dangerous because it undermines user expectations and can influence the conversation using sensitive historical data without informed consent.

Missing User Warnings

Medium

Confidence: 88%
Finding: The README promotes persistent storage of deeply reflective session content without prominently warning users that sensitive personal data will be retained on disk by default. In a psychology-style skill, this can expose intimate thoughts, fears, and plans to other local users, backups, incident responders, or malware if the host is shared or later compromised.

Missing User Warnings

High

Confidence: 96%
Finding: The skill describes silent long-term profiling that consolidates psychological inferences such as fears, risk tolerance, and behavioral heuristics across sessions, yet it does not require explicit informed consent or provide strong warnings. Because the stored data is unusually sensitive and inferential rather than merely conversational, unauthorized disclosure or misuse could cause significant privacy harm, reputational damage, or manipulation risk.

Vague Triggers

Medium

Confidence: 88%
Finding: The activation phrases are broad everyday prompts such as 'help me decide' or 'what should I do about,' which can cause accidental activation in ordinary conversations. In this skill, accidental activation matters because it can trigger persistent state creation and memory collection for users who did not intend to enter a profiling workflow.
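To illustrate why broad phrases activate by accident, here is a minimal sketch contrasting substring matching on the two phrases quoted in this finding with an explicit invocation. How the skill really matches triggers is not stated in this report, so substring matching is an assumption, and the /deepthinking prefix is a hypothetical command name used only for illustration.

```python
# Phrases quoted in the finding above.
BROAD_TRIGGERS = ("help me decide", "what should i do about")

# Hypothetical explicit command; not taken from the skill itself.
EXPLICIT_PREFIX = "/deepthinking"


def matches_broad_trigger(message):
    """Assumed behavior: activate if any broad phrase appears anywhere."""
    text = message.lower()
    return any(phrase in text for phrase in BROAD_TRIGGERS)


def matches_explicit_invocation(message):
    """Stricter alternative: activate only when the first word is the command."""
    parts = message.strip().lower().split(maxsplit=1)
    return bool(parts) and parts[0] == EXPLICIT_PREFIX
```

Under these assumptions, an everyday message such as "Can you help me decide what to have for lunch?" would start the profiling workflow with the broad matcher but not with the explicit one.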

Missing User Warnings

High

Confidence: 98%
Finding: The skill description does not prominently warn that it stores persistent state, behavioral notes, long-term memory, and evolution data under ~/.deepthinking. Because the framework invites users into reflective, potentially sensitive discussions, failing to disclose persistent storage upfront materially increases privacy risk.

Missing User Warnings

High

Confidence: 99%
Finding: The instructions explicitly direct the agent to silently incorporate relevant prior memories and themes without telling the user at the point of use. Hidden reuse of sensitive prior disclosures removes meaningful consent and can expose users to manipulation or unexpected inferences based on past conversations.
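One way to address silent reuse is to return a user-facing notice whenever stored notes are retrieved. The sketch below is illustrative only: recall_with_disclosure, the keyword match, and the 'skip memory' opt-out phrase are all hypothetical and do not describe the skill's actual retrieval code.

```python
def recall_with_disclosure(memories, topic):
    """Return (matching_notes, notice) for a topic.

    memories is a list of plain-text notes. The notice is a message
    meant to be shown to the user at the point of use, or None when
    no stored note is being used. Keyword matching is an assumption.
    """
    needle = topic.lower()
    matches = [note for note in memories if needle in note.lower()]
    notice = None
    if matches:
        notice = (
            f"Using {len(matches)} stored note(s) related to '{topic}'. "
            "Say 'skip memory' to ignore them."
        )
    return matches, notice
```

The design point is that retrieval and disclosure are produced together, so the caller cannot use prior notes without also receiving the text it should show the user.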

Missing User Warnings

Medium

Confidence: 91%
Finding: The document prescribes intense, psychologically forceful interaction patterns but provides no warning, consent mechanism, or guidance for safely handling distress. In a skill marketed as a thinking framework, users may not expect confrontational behavioral intervention, making the mismatch more dangerous.

Missing User Warnings

Medium

Confidence: 98%
Finding: The 'gun to your head' wording is a violent coercive metaphor that can be disturbing, retraumatizing, or unsafe for users with anxiety, trauma histories, or self-harm risk. Because it is written as a reusable prompt template, it normalizes harmful phrasing across interactions without any guardrail.

Missing User Warnings

Medium

Confidence: 92%
Finding: The semantic profile is written to disk without any user-facing disclosure or consent flow at the point of collection and persistence. Silent creation of long-lived behavioral profiles prevents informed consent and makes later access, misuse, or unexpected reuse more likely.

Missing User Warnings

Medium

Confidence: 95%
Finding: The store command writes user-provided tags and content into a long-lived plaintext file under the user's home directory without any consent prompt, warning, retention control, or sensitivity checks. In an agent context, this can silently persist personal, confidential, or regulated data across sessions, increasing exposure from local compromise, backups, logs, or later unintended retrieval.
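A consent-gated write is one possible mitigation for the behavior described in this finding. The following is a minimal sketch, not the skill's store command: the function name store_memory, the JSON-lines record format, and the consent_given flag are assumptions made for illustration.

```python
import json
import os


def store_memory(path, tags, content, consent_given):
    """Append one record to path only if the user has agreed.

    Hypothetical replacement for an unconditional write. Returns True
    when the record was written and False when it was refused.
    """
    if not consent_given:
        # Refuse silently persisting anything without explicit consent.
        return False
    record = json.dumps({"tags": list(tags), "content": content})
    # Request owner-only permissions (0o600) when the file is first created.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "a", encoding="utf-8") as handle:
        handle.write(record + "\n")
    return True
```

The file is still plaintext, so this sketch covers only the missing consent step and restrictive permissions on creation. Retention controls and sensitivity checks would need separate handling.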

Missing User Warnings

Medium

Confidence: 95%
Finding: The TUI loads an evolved profile and searches memory without any user-facing disclosure at session start. Because the content handled here is deeply personal and reflective, silent profile/memory use increases privacy risk and may cause users to unknowingly expose or be influenced by previously stored sensitive material.

Ssd 3

High

Confidence: 99%
Finding: The skill instructs the agent to search and reuse prior personal memories and behavioral heuristics during future sessions without clear per-use consent. This creates a direct privacy risk because sensitive disclosures can silently influence later interactions, expanding exposure beyond the original conversational context.

Ssd 3

High

Confidence: 98%
Finding: The framework directs the agent to persist detailed personal insights after meaningful exchanges, including fears, identity concerns, motivation patterns, and breakthroughs. In the context of a reflective coaching workflow, that amounts to systematic collection of potentially sensitive psychological and behavioral data with insufficient safeguards described.

Ssd 4

Medium

Confidence: 90%
Finding: The staged excavation flow deliberately elicits deeper motivations, fears, and vulnerabilities, and later feeds those disclosures into hidden profiling and memory reuse. That conversational design increases the sensitivity of collected data and makes the downstream privacy risks more severe than in a generic task-oriented skill.

Ssd 3

Medium

Confidence: 94%
Finding: The natural-language design instructs the system to consolidate episodic memory into reusable behavioral heuristics for future sessions, which directly encodes persistent user modeling. In this skill context, that makes the capability more dangerous because the feature is framed as a thinking aid, not as a transparent profiling subsystem with clear safeguards.

Ssd 3

Medium

Confidence: 92%
Finding: The skill is explicitly designed to accumulate and resurface user information over time with broad thematic search and connection features, but it defines no minimization, retention, or sensitivity boundaries. In a stateful assistant, that design materially raises privacy and security risk because unrelated future interactions can expose prior personal data beyond the user's immediate expectations.
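A retention boundary of the kind this finding says is missing could look like the sketch below. It is an assumption-laden illustration: the prune_expired name, the stored_at field, and the 90-day default are hypothetical and are not taken from the skill.

```python
from datetime import timedelta


def prune_expired(records, now, max_age_days=90):
    """Keep only records stored within the last max_age_days.

    Each record is assumed to be a dict with a 'stored_at' datetime.
    The 90-day default is an arbitrary illustrative value.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [record for record in records if record["stored_at"] >= cutoff]
```

Running a filter like this before each session would cap how far back resurfaced material can reach, which narrows the exposure window the finding describes.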

VirusTotal

66/66 vendors flagged this skill as clean.
