灵枢·AI全栈构建师

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real full-stack development assistant, but it needs review because it bundles broad code execution, external AI-provider calls, and persistent file writing without enough scoping or user controls.

Install only if you are comfortable with a high-authority development assistant. Use it in a disposable or well-scoped workspace, avoid giving it secrets or production credentials, and review or disable the unrestricted code execution, external provider adapters, auto-backup script, and persistent logging/profile features before using it on sensitive projects.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
System Prompt Leakage: Direct Leakage, Indirect Extraction, Tool-Based Exfiltration
Tool Misuse: Tool Parameter Abuse, Chaining Abuse, Unsafe Defaults
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger

Findings (22)

exec() call detected

High

Category: Dangerous Code Execution
Content: try: local_vars = {} exec(code, {"__builtins__": __builtins__, **self.tools}, local_vars) result["output"] = local_vars.get("result", "执行成功") except Exception as e: result["error"] = str(e)
Confidence: 99%
Finding: exec(code, {"__builtins__": __builtins__, **self.tools}, local_vars)
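To illustrate why this pattern is rated High, the following minimal sketch reproduces it with a hypothetical `run_snippet` helper (the helper and the probe string are assumptions for illustration, not code from the scanned skill). It shows that a snippet executed with full builtins can import any module, while the same snippet fails when builtins are withheld.

```python
import builtins


def run_snippet(code, exec_globals):
    """Run code with exec() and report the outcome, mirroring the flagged pattern."""
    result = {"output": None, "error": None}
    local_vars = {}
    try:
        exec(code, exec_globals, local_vars)
        result["output"] = local_vars.get("result")
    except Exception as exc:
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result


# Hypothetical probe: can the snippet reach os.system through __import__?
PROBE = "result = hasattr(__import__('os'), 'system')"

# Full builtins, as in the finding: the import succeeds.
full_builtins = run_snippet(PROBE, {"__builtins__": builtins})

# Empty builtins: the name lookup fails before anything is imported.
no_builtins = run_snippet(PROBE, {"__builtins__": {}})
```

Emptying `__builtins__` only demonstrates the difference; it is widely regarded as bypassable and should not be treated as a security boundary. Real isolation would more plausibly come from a separate process or container with restricted privileges.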

subprocess module call

Medium

Category: Dangerous Code Execution
Content: logging.info(f"执行命令: {command}") try: # 使用shell=True时，命令会被正确处理，包括包含空格的路径 result = subprocess.run( command, shell=True, cwd=cwd,
Confidence: 97%
Finding: result = subprocess.run( command, shell=True, cwd=cwd, capture_output=True, text=True, check=True )
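One common way to reduce the risk of `shell=True` is to pass an argument list so that shell metacharacters are never interpreted. The sketch below assumes a hypothetical `run_command` wrapper and a 30-second timeout; neither comes from the scanned skill.

```python
import shlex
import subprocess
import sys


def run_command(command, cwd=None, timeout=30):
    """Split the command into arguments and run it without a shell."""
    args = shlex.split(command)
    return subprocess.run(
        args,
        shell=False,          # no shell, so ';', '&&', and '|' are plain arguments
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,      # assumed limit for illustration
    )


# The trailing "; echo injected" would run as a second command under shell=True.
# Here it is passed to the Python interpreter as inert extra arguments.
command = f"{shlex.quote(sys.executable)} -c 'print(40 + 2)' ; echo injected"
completed = run_command(command)
```

Paths containing spaces still work with this approach as long as they are quoted in the command string, which addresses the stated reason for using the shell in the first place.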

Context-Inappropriate Capability

Medium

Confidence: 85%
Finding: The module creates and persistently stores synthesized personality profiles in a local JSON file without any clear necessity tied to the skill's stated purpose, consent flow, retention control, or access restrictions. In an agent environment, unexpected persistent storage can retain sensitive or copyrighted source-derived data and expand privacy risk, especially if materials originate from user input or shared workspaces.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The code promises fully isolated contexts, but read_from_space() explicitly falls back to parent_space, allowing data from another context to be read across space boundaries. In an agent skill that may process multiple tasks or users, this breaks the stated isolation guarantee and can expose prior task data to later operations or prompts.
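The isolation gap described in this finding can be sketched as follows. The `ContextSpace` class and its method names are hypothetical stand-ins; only `parent_space` and the fallback behavior are taken from the finding itself.

```python
class ContextSpace:
    """Hypothetical context container used only to illustrate the finding."""

    def __init__(self, name, parent_space=None):
        self.name = name
        self.parent_space = parent_space
        self.data = {}

    def write(self, key, value):
        self.data[key] = value

    def read_with_fallback(self, key):
        # Pattern from the finding: a miss falls through to the parent space.
        if key in self.data:
            return self.data[key]
        if self.parent_space is not None:
            return self.parent_space.read_with_fallback(key)
        return None

    def read_isolated(self, key):
        # Strict variant: only this space's own data is visible.
        return self.data.get(key)


task_a = ContextSpace("task-a")
task_a.write("api_token", "secret-from-task-a")

task_b = ContextSpace("task-b", parent_space=task_a)
leaked = task_b.read_with_fallback("api_token")   # crosses the boundary
contained = task_b.read_isolated("api_token")     # stays inside task-b
```

If inheritance from a parent is genuinely needed, making it an explicit opt-in per key would keep the default behavior consistent with the advertised isolation.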

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The module advertises complete context isolation, but closed spaces remain in the in-memory registry with all task data intact until opportunistic cleanup occurs. This means sensitive prompts, reasoning, and responses may remain accessible after a task is supposedly finished, undermining privacy and increasing cross-task exposure risk.
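As a rough illustration of the retention problem, the sketch below contrasts marking a space as closed with purging it on close. `SpaceRegistry` and all of its methods are hypothetical names chosen for this example, not the skill's actual API.

```python
class SpaceRegistry:
    """Hypothetical in-memory registry of task spaces."""

    def __init__(self):
        self._spaces = {}

    def open(self, space_id):
        self._spaces[space_id] = {"status": "open", "data": {}}
        return self._spaces[space_id]

    def close_lazy(self, space_id):
        # Pattern from the finding: the space is flagged but its data remains.
        self._spaces[space_id]["status"] = "closed"

    def close_and_purge(self, space_id):
        # Alternative: drop the entry and clear its data at close time.
        space = self._spaces.pop(space_id, None)
        if space is not None:
            space["data"].clear()

    def peek(self, space_id):
        space = self._spaces.get(space_id)
        return None if space is None else dict(space["data"])


registry = SpaceRegistry()

registry.open("t1")["data"]["prompt"] = "sensitive prompt"
registry.close_lazy("t1")
after_lazy_close = registry.peek("t1")     # data is still readable

registry.open("t2")["data"]["prompt"] = "sensitive prompt"
registry.close_and_purge("t2")
after_purge = registry.peek("t2")          # nothing left to read
```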

Context-Inappropriate Capability

High

Confidence: 95%
Finding: This skill is presented as a full-stack development guidance/build assistant, but it embeds a general-purpose code execution capability that materially expands the attack surface beyond the stated purpose. In an agent context, this mismatch is dangerous because users or downstream systems may route untrusted prompts or generated code into the feature without expecting runtime execution risk.

Intent-Code Divergence

High

Confidence: 98%
Finding: The documentation advertises a 'code execution sandbox', but the implementation later uses exec() with full builtins and no actual isolation controls. This is especially dangerous because it creates a false sense of safety, encouraging operators to trust a feature that can execute arbitrary Python in-process with host privileges.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: This file exposes a detailed inventory of user-local content with absolute filesystem paths under a specific home directory, including application support locations and skill storage structure. Even without direct code execution, this leaks sensitive host metadata that can aid fingerprinting, targeted prompt attacks against local resources, privacy violations, or follow-on attempts to access predictable files.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The Filebeat example hardcodes the Elasticsearch password as "changeme", directly contradicting the document's own guidance to manage sensitive information securely. Even as sample content, readers may copy this into real deployments, leading to credential exposure, insecure defaults, or accidental production use of trivial passwords.
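A simple check like the one below could catch placeholder credentials in sample configs before they are published. This is a sketch only: the `find_placeholder_secrets` helper, the placeholder list, and the sample YAML are all assumptions, since the original Filebeat example is not reproduced in this report.

```python
import re

# Assumed list of trivial values worth flagging.
PLACEHOLDER_SECRETS = {"changeme", "password", "secret", "your-secret-key"}

# Matches lines such as `password: "changeme"` or `JWT_SECRET=changeme`.
SECRET_LINE = re.compile(
    r'^\s*(?P<key>[\w.]*(?:password|secret)[\w.]*)\s*[:=]\s*["\']?(?P<value>[^"\'\s#]+)',
    re.IGNORECASE,
)


def find_placeholder_secrets(config_text):
    """Return (line_number, key) pairs whose value is a known placeholder."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = SECRET_LINE.match(line)
        if match and match.group("value").lower() in PLACEHOLDER_SECRETS:
            hits.append((lineno, match.group("key")))
    return hits


# Hypothetical sample resembling the kind of config the finding describes.
HARDCODED = '''output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "changeme"
'''

# Same sample with the value supplied through an environment reference.
FROM_ENV = HARDCODED.replace('"changeme"', '"${ES_PASSWORD}"')

hardcoded_hits = find_placeholder_secrets(HARDCODED)
env_hits = find_placeholder_secrets(FROM_ENV)
```

The `ES_PASSWORD` variable name is illustrative; the point is that the sample carries a reference rather than a literal value.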

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The script creates directories and writes backup artifacts outside the skill directory, which exceeds the stated scope of a developer guidance skill and introduces side effects on the host workspace. In an agent skill context, undeclared filesystem modification is risky because users may not expect persistent writes in parent directories.
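One way to keep writes inside a declared directory is to resolve every target path and reject anything that escapes the base. The sketch below assumes a hypothetical `resolve_inside` helper and Python 3.9 or later (for `Path.is_relative_to`); the directory names are placeholders.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir, candidate):
    """Return the resolved target path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    target = (base / candidate).resolve()
    if not target.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}: {target}")
    return target


with tempfile.TemporaryDirectory() as skill_dir:
    # A path under the skill directory is accepted.
    allowed = resolve_inside(skill_dir, "backups/snapshot.json")
    allowed_ok = allowed.is_relative_to(Path(skill_dir).resolve())

    # A path that climbs into the parent directory is rejected.
    try:
        resolve_inside(skill_dir, "../backups/snapshot.json")
        escaped = True
    except ValueError:
        escaped = False
```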

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The script runs local maintenance commands even though the skill is described as a development advisor rather than a system automation tool. In agent environments, undisclosed command execution increases risk because it can alter local state, invoke external tools, and expand the attack surface beyond simple code-generation assistance.

Vague Triggers

Medium

Confidence: 95%
Finding: The documented trigger phrases are very broad, generic development terms such as '全栈开发' (full-stack development), '架构设计' (architecture design), and '代码生成' (code generation). In a skill marketplace or agent router, this can cause unintended activation during ordinary software discussions, exposing users to autonomous workflow behavior, tool use, or content generation they did not explicitly request. The skill context increases risk because this is a high-scope development agent that claims end-to-end execution, PRD generation, code generation, and deployment guidance.

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger phrases are very broad and cover generic software tasks like development, architecture, code generation, and deployment, making accidental invocation likely. In a skill with implied high-power capabilities and automated multi-step workflows, overbroad activation increases the chance of unintended execution, context capture, or user confusion about which agent is acting.

Missing User Warnings

Medium

Confidence: 84%
Finding: This section includes runnable web server examples and outbound HTTP usage without any safety note about network exposure, debug mode, or untrusted input handling. In this skill's context as a code-generation and full-stack guidance library, users may copy snippets directly into real projects, increasing the chance they deploy insecure defaults such as Flask debug mode or internet-reachable endpoints without safeguards.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code writes personality data to disk silently, with no user-facing notice, opt-in, or storage policy. In this skill context, that is more dangerous because users would reasonably expect a development assistant, not hidden local persistence of synthesized profiles potentially derived from provided materials.

Missing User Warnings

Medium

Confidence: 91%
Finding: The workflow can send user input, extracted constraints, and synthesized reasoning to an external platform_adapter without any consent gate, disclosure, or redaction step. In a developer assistant context, users may provide source code, architecture details, credentials, or business-sensitive data, so silent transfer to third-party models creates a real confidentiality risk.
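The missing consent gate and redaction step could look roughly like the sketch below. Everything here is hypothetical: `send_to_provider`, the `transport` callable standing in for the external adapter, and the two redaction rules, which are deliberately simple and would miss many real secret formats.

```python
import re

# Assumed, intentionally small rule set for illustration.
REDACTION_RULES = [
    # key=value or key: value pairs whose key suggests a credential
    (re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # e-mail addresses
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
]


def redact(text):
    """Apply each redaction rule in order and return the scrubbed text."""
    for pattern, replacement in REDACTION_RULES:
        text = pattern.sub(replacement, text)
    return text


def send_to_provider(payload, *, user_consented, transport):
    """Forward the payload only after explicit consent, and only redacted."""
    if not user_consented:
        raise PermissionError("user has not consented to external transmission")
    return transport(redact(payload))


# A list's append method stands in for the external adapter.
sent = []
send_to_provider(
    "password=hunter2 contact bob@example.com",
    user_consented=True,
    transport=sent.append,
)

try:
    send_to_provider("anything", user_consented=False, transport=sent.append)
    blocked = False
except PermissionError:
    blocked = True
```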

Missing User Warnings

High

Confidence: 96%
Finding: The code execution interface exposes arbitrary execution without any user warning, confirmation, authorization check, or policy enforcement. In a skill intended for development assistance, that makes accidental or prompt-induced execution of untrusted code much more likely, increasing the chance of host compromise or data exposure.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill explicitly instructs the agent to write execution logs and reusable patterns into repository files, but it does not require user consent, preview, opt-in, or safeguards around what gets persisted. In an agent setting, this can cause unintended modification of project state, leakage of sensitive execution context into tracked files, and silent pollution of the repository with AI-generated content that may later be committed or reused.

Missing User Warnings

Medium

Confidence: 90%
Finding: This healthcare PRD template explicitly covers electronic medical records, data export, analytics, and diagnostic-support features, all of which involve highly sensitive regulated health data. Although it mentions compliance at a high level, it does not include concrete privacy-risk warnings, safety constraints, or mandatory requirements around least privilege, auditability, de-identification, export controls, and human oversight, which can lead downstream users to generate unsafe product requirements for PHI-handling systems.

Ssd 3

Medium

Confidence: 94%
Finding: The prompt-building path embeds raw user-derived content, constraints, and intermediate reasoning directly into plain-language prompts and internal records. If the input contains secrets, proprietary code, tokens, or personal data, this increases the chance of disclosure through logs, downstream model providers, debugging output, or later inspection.

Ssd 3

Medium

Confidence: 97%
Finding: The end-to-end workflow stores full user input, parsed intent, thinking steps, decision data, and response together in a retained context object, effectively creating a sensitive transcript in memory. In a multi-task agent environment, such retained transcripts raise the risk of accidental reuse, unauthorized access, prompt leakage, and privacy violations.

Unsafe Defaults

Medium

Category: Tool Misuse
Content:
```env
# .env
PORT=3000
NODE_ENV=development
MONGO_URI=mongodb://localhost:27017/myapp
JWT_SECRET=your-secret-key
```
Confidence: 92%
Finding: NODE_ENV=development
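A startup check along these lines could flag the unsafe defaults in the sample above before they reach a real deployment. The helper names, the placeholder list, and the 32-character minimum are assumptions made for this sketch.

```python
# Assumed values for illustration only.
PLACEHOLDER_SECRETS = {"your-secret-key", "changeme", "secret"}
MIN_SECRET_LENGTH = 32


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def audit_env(env):
    """Return a list of human-readable problems found in the settings."""
    problems = []
    if env.get("NODE_ENV") != "production":
        problems.append("NODE_ENV is not 'production'")
    secret = env.get("JWT_SECRET", "")
    if secret.lower() in PLACEHOLDER_SECRETS or len(secret) < MIN_SECRET_LENGTH:
        problems.append("JWT_SECRET is a placeholder or too short")
    return problems


# The sample values quoted in the finding.
SAMPLE = """# .env
PORT=3000
NODE_ENV=development
MONGO_URI=mongodb://localhost:27017/myapp
JWT_SECRET=your-secret-key
"""

problems = audit_env(parse_env(SAMPLE))
```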

VirusTotal

VirusTotal findings are pending for this skill version.
