Agent Harness Engineer

Security checks across malware telemetry and agentic risk

Overview

This is a coherent agent-building skill, but it needs Review because its scaffolds and guidance normalize powerful file, network, hook, and logging behavior without enough default boundaries.

Install only if you want a skill that can shape full AI-agent project architecture. Before using its scaffolds, require explicit approval for file writes, shell, browser, and network actions; restrict filesystem access to a workspace; add outbound network allowlists and SSRF protections; avoid hardcoded API keys; do not use deprecated vm2 as a security boundary; and redact or minimize prompts, tool inputs, and session logs before storing or transferring them.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (37)

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The document explicitly warns against deprecated libraries, yet recommends `vm2` in the Node.js sandbox section while labeling it as deprecated. In an agent-engineering skill, sandbox guidance is security-relevant; users may still adopt the listed option for untrusted-code isolation, and `vm2` has a history of sandbox-escape issues, making this contradictory guidance materially unsafe.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The default agent template grants shell execution, repository search, network access, and browser operation as baseline capabilities even though this file only defines a generic coding assistant role. Making high-risk capabilities available by default violates least-privilege and increases the chance that downstream agents built from this scaffold can access local systems, external services, or sensitive browser context without task-specific justification.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The scaffold defines a generic HTTP request capability that can target arbitrary URLs and methods, which exceeds what is typically necessary for an agent-engineering template and creates an SSRF-capable primitive. In an agent context, this can be combined with model-directed tool use to probe internal services, cloud metadata endpoints, or exfiltrate data to attacker-controlled hosts.
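One way to narrow such a primitive is to check every URL against an allowlist before any request is sent. The following is a minimal sketch only, not the skill's code: `ALLOWED_HOSTS`, `ALLOWED_SCHEMES`, and `is_url_allowed` are hypothetical names, and `api.example.com` is a placeholder host.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = {"api.example.com"}
ALLOWED_SCHEMES = {"https"}


def is_url_allowed(url: str) -> bool:
    """Return True only for https URLs whose host is explicitly allowlisted."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    host = parsed.hostname
    if host is None:
        return False
    # Reject IP literals outright, so loopback, private, and link-local
    # addresses (for example cloud metadata endpoints) are never accepted,
    # even if one were added to the allowlist by mistake.
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass  # not an IP literal; fall through to the hostname check
    return host in ALLOWED_HOSTS
```

A hostname check alone does not stop an allowlisted name from resolving to an internal address, so a fuller defense would also validate the resolved IP and disable or re-check redirects.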

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The core scaffold auto-registers sensitive capabilities for file writing and arbitrary outbound HTTP without any visible permission gating, scope restriction, or trust boundary checks in the core loop. In an agent framework, this materially increases the blast radius of prompt injection or model misbehavior because an LLM can immediately gain filesystem modification and network exfiltration primitives by default.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The hook loader imports and executes every Python file found in a configurable local directory via exec_module(), which runs arbitrary module-level code at load time. In an agent/harness framework, this creates a powerful code-execution extension point with no signing, allowlist, sandboxing, or trust boundary enforcement, so any attacker who can place or modify hook files gains arbitrary code execution inside the agent process.
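A possible mitigation is to load only hook files that are pinned by name and content hash. The sketch below assumes a hypothetical `load_trusted_hooks` function and a caller-supplied `trusted_sha256` mapping of file name to SHA-256 digest; neither comes from the scanned skill.

```python
import hashlib
import importlib.util
from pathlib import Path
from types import ModuleType


def load_trusted_hooks(hook_dir: Path, trusted_sha256: dict[str, str]) -> list[ModuleType]:
    """Load only hook files whose name and SHA-256 digest are pinned."""
    loaded = []
    for path in sorted(hook_dir.glob("*.py")):
        expected = trusted_sha256.get(path.name)
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if expected is None or actual != expected:
            continue  # unknown or modified file: never executed
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        # exec_module still runs module-level code, but only for pinned files.
        spec.loader.exec_module(module)
        loaded.append(module)
    return loaded
```

This narrows the trust boundary but does not sandbox the hooks, and a file could still be swapped between the hash check and the import, so the hook directory should also be write-protected.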

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: This scaffold exposes generic outbound HTTP capabilities (`web_fetch` and `http_request`) that are broader than the stated engineering/scaffolding purpose, enabling arbitrary requests to attacker-controlled or internal endpoints. In an agent/tooling context, this materially increases SSRF, data exfiltration, and unintended network interaction risk because downstream agent logic may call these tools on untrusted input.

Vague Triggers

Medium

Confidence: 93%
Finding: The README advertises very broad natural-language trigger phrases such as 'Build an agent' and 'Create an AI assistant', which can cause the skill to activate in many unrelated coding contexts. For a powerful agent-engineering skill that influences architecture, permissions, tools, and deployment behavior, unintended invocation can steer the assistant into applying complex scaffolding or risky patterns when the user did not explicitly request this skill.

Vague Triggers

Medium

Confidence: 89%
Finding: The trigger phrases are broad and overlap with common requests such as building an assistant, automation tool, or agent system. In a skill-loading environment, this can cause the skill to activate unexpectedly for general-purpose prompts, expanding its influence over code generation and architecture decisions without the user's explicit intent.

Vague Triggers

Medium

Confidence: 91%
Finding: The Chinese-language prompt 'Quickly build an Agent prototype in Python, just the simplest version, as long as it runs' is broad enough to match many ordinary coding requests, which can cause this skill to activate when a user did not ask for agent-harness guidance. In a skill that steers architecture, dependencies, and security posture, over-triggering can misroute requests and inject unnecessary or risky implementation defaults.

Vague Triggers

Medium

Confidence: 89%
Finding: The Chinese-language request 'I need to build an Agent system for internal team use; it must be extensible, easy to maintain, and have test coverage' is still fairly generic and could overlap with normal internal tooling or service scaffolding requests. Because this skill prescribes substantial architecture and stack choices, vague trigger conditions increase the chance of inappropriate activation and unintended steering of the user's implementation path.

Vague Triggers

Medium

Confidence: 93%
Finding: The architecture-design prompt for an 'AI coding assistant Agent' (given in Chinese) is broad and likely to match a wide range of coding-assistant or automation requests without clear boundaries. In this context, an overly broad trigger is more dangerous because the skill can impose complex defaults around permissions, sandboxing, and project structure even when the user asked for something narrower.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The eval expects the system to 'use Anthropic by default' (stated in Chinese) when the user has not specified a provider, which introduces a provider choice without explicit user opt-in. This is risky because it can bias implementation toward a vendor, encourage accidental credential usage or lock-in, and override user preferences in a security-sensitive agent-building workflow.

Missing User Warnings

High

Confidence: 97%
Finding: The guidance explicitly tells implementers to use a hardcoded API key in code, which is an insecure secret-management practice. In an agent-engineering skill, this is especially dangerous because downstream AI coding tools may reproduce the pattern into real repositories, leading to credential leakage via source control, logs, builds, or client-side exposure.
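A common alternative is to read the key from the environment at startup and fail fast when it is missing. This is a minimal sketch; `load_api_key` and the `AGENT_API_KEY` variable name are hypothetical and not taken from the skill.

```python
import os


def load_api_key(env_var: str = "AGENT_API_KEY") -> str:
    """Read the API key from the environment instead of source code."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail fast rather than falling back to a key embedded in the repo.
        raise RuntimeError(f"{env_var} is not set; refusing to start without a key")
    return key
```

In a deployed system the variable would typically be populated by a secret manager or the process supervisor rather than a committed file.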

Missing User Warnings

Medium

Confidence: 91%
Finding: This section explicitly notes that custom/MCP tools execute on the client side with real system permissions, but it does not make strong, user-facing confirmation requirements a mandatory control for high-risk actions. In an agent-engineering skill, that omission is meaningful because downstream implementers may copy the design and expose file deletion, shell execution, deployment, or network actions without an approval gate, increasing the chance of destructive or unauthorized operations.

Natural-Language Policy Violations

Medium

Confidence: 91%
Finding: The skill hard-codes a user language preference (`zh-CN`) into long-term UserMemory without requiring explicit user choice. In an agent-building context, this encourages developers to persist inferred personal preferences by default, which can cause privacy issues, incorrect personalization, and cross-session leakage of user profile data.

Missing User Warnings

Medium

Confidence: 90%
Finding: The document recommends transferring entire session JSONL files over a network during migration, but does not require user notice, consent, encryption, authentication, or integrity protection. Because these logs can contain user messages, tool arguments, outputs, and audit data, silent transfer increases the risk of sensitive data exposure or unauthorized cross-environment movement.

Missing User Warnings

Medium

Confidence: 94%
Finding: The debugging workflow instructs operators to export production session JSONL files into development environments without any warning about secrets, personal data, or audit records embedded in those logs. This can leak sensitive production data into less-trusted systems and broaden access far beyond the original operational boundary.

Missing User Warnings

Medium

Confidence: 94%
Finding: The guidance explicitly recommends logging full LLM prompts in development, but prompts often contain secrets, personal data, proprietary code, or security-relevant context. In an agent-harness skill, developers may copy this pattern directly into production-adjacent systems or shared dev environments, causing sensitive data exposure through logs, log aggregation platforms, and debugging artifacts.

Missing User Warnings

Medium

Confidence: 92%
Finding: The audit guidance says data modification records can be covered by logging tool arguments, but tool arguments may include raw file contents, credentials, tokens, personal data, or internal business data. In an agent system, tool invocations are a high-risk channel for sensitive material, so persisting arguments in audit logs can create a secondary data-exposure surface and broaden access to sensitive content.

Missing User Warnings

Medium

Confidence: 88%
Finding: The template advertises potentially impactful capabilities such as shell execution, network requests, and browser actions without any accompanying warning about system modification, data exfiltration, credential exposure, or privacy implications. In an agent scaffold, this omission can cause developers or end users to enable or trust powerful actions without understanding their security consequences.

Missing User Warnings

Medium

Confidence: 93%
Finding: The hook persists raw tool input metadata to a local audit log without any consent, minimization, or redaction. In an agent/harness system, tool inputs commonly contain prompts, secrets, file contents, credentials, or user data, so writing them to disk can create an unintended sensitive-data store that may be exposed through local access, backups, log shipping, or later compromise.
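Redaction and minimization could be applied before anything reaches the audit log. The sketch below is illustrative only: `SENSITIVE_KEYS`, `TOKEN_PATTERN`, `MAX_VALUE_LENGTH`, `redact`, and `audit_line` are hypothetical names, and the key list and token pattern are assumptions that would need tuning for a real system.

```python
import json
import re

# Hypothetical key names treated as secrets regardless of their value.
SENSITIVE_KEYS = {"api_key", "token", "password", "secret", "authorization"}
# Illustrative pattern for bearer-style tokens; real secret formats vary.
TOKEN_PATTERN = re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+")
MAX_VALUE_LENGTH = 200  # minimization: long values such as file contents are cut


def redact(value):
    """Recursively mask sensitive keys, token-like strings, and long values."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        cleaned = TOKEN_PATTERN.sub("[REDACTED]", value)
        if len(cleaned) > MAX_VALUE_LENGTH:
            return cleaned[:MAX_VALUE_LENGTH] + "...[truncated]"
        return cleaned
    return value


def audit_line(tool_name: str, tool_input: dict) -> str:
    """Build one JSON audit record from already-redacted tool input."""
    return json.dumps({"tool": tool_name, "input": redact(tool_input)}, sort_keys=True)
```

Pattern-based redaction misses secrets it does not recognize, so it complements, rather than replaces, logging fewer fields in the first place.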

Missing User Warnings

Medium

Confidence: 90%
Finding: The core agent automatically registers both file-writing and arbitrary network request capabilities, and the run loop executes model-selected tool calls without any approval gate, policy check, or user confirmation. In an agent-building scaffold, this is especially risky because downstream users may treat this as a safe default and deploy an agent that can modify local files or exfiltrate data via HTTP based solely on LLM output or prompt injection.
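An approval gate in front of tool dispatch might look like the following sketch. It assumes a hypothetical `execute_tool_call` dispatcher, a caller-supplied `approve` callback (for example a user prompt), and illustrative tool names in `HIGH_RISK_TOOLS`; none of these are the scaffold's actual API.

```python
from typing import Any, Callable, Dict

# Hypothetical names for tools that must never run on model output alone.
HIGH_RISK_TOOLS = {"write_file", "http_request", "shell"}


class ApprovalDenied(Exception):
    """Raised when a high-risk tool call is not approved."""


def execute_tool_call(
    name: str,
    args: Dict[str, Any],
    tools: Dict[str, Callable[..., Any]],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> Any:
    """Run a model-selected tool, asking for approval before high-risk ones."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    if name in HIGH_RISK_TOOLS and not approve(name, args):
        raise ApprovalDenied(f"call to {name} was not approved")
    return tools[name](**args)
```

Keeping the check in the dispatcher, rather than inside each tool, means a newly registered high-risk tool is gated as soon as its name is added to the set.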

Missing User Warnings

Medium

Confidence: 90%
Finding: The writeFile method accepts an arbitrary path and writes content directly, creating parent directories as needed and overwriting existing files without any path restriction, confirmation, or safety guard. In an agent-harness context, this increases risk because higher-level agent logic may pass attacker-influenced paths, enabling destructive overwrites of project files or writes outside the intended workspace.
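Confining writes to a workspace directory and refusing silent overwrites could be sketched as below. This is a Python illustration rather than the scaffold's own `writeFile`; `resolve_in_workspace`, `write_file`, and the `overwrite` flag are hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, requested: str) -> Path:
    """Resolve a requested path and reject anything outside the workspace."""
    root = workspace.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        raise PermissionError(f"path escapes workspace: {requested}")
    return target


def write_file(workspace: Path, requested: str, content: str, overwrite: bool = False) -> Path:
    """Write inside the workspace only; overwriting needs an explicit opt-in."""
    target = resolve_in_workspace(workspace, requested)
    if target.exists() and not overwrite:
        raise FileExistsError(f"refusing to overwrite: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target
```

An absolute path such as `/etc/passwd` or a traversal such as `../outside.txt` resolves outside the root and is rejected before any directory is created.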

Missing User Warnings

Medium

Confidence: 95%
Finding: Both webFetch and httpRequest perform arbitrary outbound network access without any built-in disclosure, confirmation, or safety boundary, making silent external communication possible. In a production agent scaffold, this is risky because prompts or indirect prompt injection can induce the agent to contact attacker-controlled endpoints, leak contextual data, or access internal network resources.

Missing User Warnings

Medium

Confidence: 92%
Finding: The template grants or normalizes sensitive capabilities such as shell execution, network requests, file modification, backups, and browser operations without any explicit warning, consent boundary, or data-handling guidance. In a production agent scaffold, this can cause downstream agents to be generated with powerful actions enabled by default, increasing the chance of unintended system changes, privacy leakage, or risky external interactions.

VirusTotal

65/65 vendors flagged this skill as clean.
