AgentShield Audit

Security checks across malware telemetry and agentic risk

Overview

AgentShield is a coherent security-audit skill, but its code under-discloses some network, key, and trust-verification behavior users should review before installing.

Review the code before installing, run it first in a test workspace, and do not rely on --dry-run as a no-network mode. Use the default AgentShield endpoint unless you intentionally trust another server, avoid --yes on real agents, back up and protect the generated agent.key file, and treat peer verification results as registry/API checks rather than full local cryptographic certificate validation.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (24)

Tainted flow: 'API' from os.getenv (line 13, credential/environment) → requests.get (network output)

Critical
Category
Data Flow
Content
agent_id = load_cert()
print(f"📜 Agent ID: {agent_id}\n")

status = requests.get(f"{API}/trust-handshake/status/{args.handshake_id}").json()
if status["status"] == "completed":
    print(f"✅ Already done! Session Key: {status['session_key']}")
    return
Confidence
90% confidence
Finding
status = requests.get(f"{API}/trust-handshake/status/{args.handshake_id}").json()

Tainted flow: 'API' from os.getenv (line 13, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
print(f"✓ Role: {role.upper()} | Partner: {partner}\n🔏 Signing...")
sig = base64.b64encode(key.sign(base64.b64decode(chal))).decode()

res = requests.post(f"{API}/trust-handshake/complete", json={
    "handshake_id": args.handshake_id,
    "agent_id": agent_id,
    "signed_challenge": sig
Confidence
94% confidence
Finding
res = requests.post(f"{API}/trust-handshake/complete", json={"handshake_id": args.handshake_id, "agent_id": agent_id, "signed_challenge": sig}).json()

Tainted flow: 'AGENTSHIELD_API' from os.environ.get (line 20, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
    "challenge_response": signature
}

response = requests.post(
    f"{AGENTSHIELD_API}/api/agent-audit/challenge",
    json=payload,
    timeout=30
Confidence
98% confidence
Finding
response = requests.post(f"{AGENTSHIELD_API}/api/agent-audit/challenge", json=payload, timeout=30)

Tainted flow: 'AGENTSHIELD_API' from os.environ.get (line 20, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
payload["agent_version"] = agent_version
    
    try:
        response = requests.post(
            f"{AGENTSHIELD_API}/api/agent-audit/initiate",
            json=payload,
            timeout=30
Confidence
98% confidence
Finding
response = requests.post(f"{AGENTSHIELD_API}/api/agent-audit/initiate", json=payload, timeout=30)

Tainted flow: 'AGENTSHIELD_API' from os.environ.get (line 14, credential/environment) → requests.get (network output)

Critical
Category
Data Flow
Content
import requests

try:
    response = requests.get(
        f"{AGENTSHIELD_API}/api/verify/{agent_id}",
        timeout=30  # Increased for Heroku cold starts
    )
Confidence
92% confidence
Finding
response = requests.get(f"{AGENTSHIELD_API}/api/verify/{agent_id}", timeout=30)  # Increased for Heroku cold starts
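
The data-flow findings above share one pattern: a base URL read from the environment is interpolated directly into an outbound request. A minimal sketch of checking that value before any request is made follows. The default host `agentshield.example`, the allowlist, and the helper name `resolve_api_base` are hypothetical placeholders; the skill's real default endpoint is not shown in this report.

```python
import os
from collections.abc import Mapping
from urllib.parse import urlparse

# Hypothetical placeholders: the real default endpoint is not given in this report.
DEFAULT_API = "https://agentshield.example"
TRUSTED_HOSTS = frozenset({"agentshield.example"})


def resolve_api_base(env: Mapping[str, str] = os.environ) -> str:
    """Return the API base URL, rejecting non-HTTPS schemes and unlisted hosts."""
    raw = env.get("AGENTSHIELD_API", DEFAULT_API)
    parsed = urlparse(raw)
    if parsed.scheme != "https":
        raise ValueError(f"refusing non-HTTPS endpoint: {raw!r}")
    # parsed.hostname ignores any user:password@ prefix, so
    # "https://agentshield.example@evil.test" resolves to "evil.test" and is rejected.
    if parsed.hostname not in TRUSTED_HOSTS:
        raise ValueError(f"refusing untrusted host: {parsed.hostname!r}")
    return raw.rstrip("/")
```

Failing loudly here means an overridden `AGENTSHIELD_API` cannot silently redirect agent IDs or signed challenges to an unexpected server. A user who intentionally trusts another server would have to add it to the allowlist explicitly.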

Description-Behavior Mismatch

High
Confidence
95% confidence
Finding
The file's functionality materially diverges from the skill's declared purpose: instead of implementing agent-to-agent trust infrastructure, it performs model/dataset supply-chain scanning. In a security-branded skill, this kind of capability mismatch is dangerous because users may rely on promised trust-handshake and certificate protections that are not actually present, creating a false sense of security and potentially expanding the skill's access to local files and code under misleading pretenses.

Intent-Code Divergence

Medium
Confidence
89% confidence
Finding
The docstring advertises broad protection against poisoning, tampering, malicious fine-tuning data, and backdoors, but the implementation only performs basic hash comparison and regex scanning. Overstated security claims are dangerous in defensive tooling because operators may trust incomplete checks, skip stronger controls, and wrongly conclude models or datasets are safe when sophisticated attacks would bypass these simplistic detections.

Intent-Code Divergence

Medium
Confidence
97% confidence
Finding
The function constructs a sanitized string representation in `output_str`, but for common return types (`str`, `list`, `dict`, etc.) it returns the original `output` instead of the sanitized value. This creates a mismatch between the documented security behavior and actual behavior, allowing suspicious content such as script tags, JavaScript URLs, or exfiltration-like payloads to pass through unchanged to downstream agents or UIs.
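
To illustrate the fix this finding implies, here is a minimal sketch of a sanitizer that returns the cleaned value for every container type instead of the original object. The function names and the two regex patterns are assumptions for illustration, not the skill's code, and regex filtering alone is not a complete defense against script injection.

```python
import re
from typing import Any

# Illustrative patterns only; a real filter would need a much broader rule set.
_SUSPICIOUS = [
    re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL),
    re.compile(r"javascript:", re.IGNORECASE),
]


def sanitize_text(text: str) -> str:
    """Replace each suspicious match with a visible marker."""
    for pattern in _SUSPICIOUS:
        text = pattern.sub("[removed]", text)
    return text


def sanitize_output(output: Any) -> Any:
    """Return a sanitized copy of output, recursing into dicts, lists, and tuples."""
    if isinstance(output, str):
        return sanitize_text(output)
    if isinstance(output, dict):
        return {key: sanitize_output(value) for key, value in output.items()}
    if isinstance(output, (list, tuple)):
        return type(output)(sanitize_output(item) for item in output)
    return output  # numbers, None, and other scalars pass through unchanged
```

The key point is that every branch returns the value it just sanitized, so the documented behavior and the actual return value cannot drift apart.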

Intent-Code Divergence

High
Confidence
99% confidence
Finding
The function intended to verify that a certificate was signed by AgentShield always returns True, meaning certificate authenticity is never actually checked. In a trust-establishment tool, this is a fundamental authentication failure: any attacker-controlled or tampered certificate response can be accepted as valid if other superficial fields pass, undermining the entire security model.
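
For contrast with a check that always returns True, the sketch below shows a fail-closed Ed25519 verification using the `cryptography` package that the skill already depends on. The certificate layout (`payload` and `signature` fields), the canonical JSON encoding, and the out-of-band issuer public key are all assumptions; the report does not describe AgentShield's actual certificate format.

```python
import base64
import json


def verify_certificate_signature(cert: dict, issuer_public_key: bytes) -> bool:
    """Return True only if cert['signature'] is a valid Ed25519 signature
    over the canonical JSON of cert['payload'] (hypothetical layout)."""
    payload = cert.get("payload")
    signature_b64 = cert.get("signature")
    if not isinstance(payload, dict) or not isinstance(signature_b64, str):
        return False
    try:
        signature = base64.b64decode(signature_b64, validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return False

    # Imported lazily so malformed input is rejected before any crypto is loaded.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    message = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    try:
        Ed25519PublicKey.from_public_bytes(issuer_public_key).verify(signature, message)
    except (InvalidSignature, ValueError):
        return False
    return True
```

Every failure path returns False, so a tampered or unsigned certificate cannot pass by default. The issuer key would need to ship with the skill or be pinned locally; fetching it from the same API that returned the certificate would not add any assurance.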

Missing User Warnings

Medium
Confidence
87% confidence
Finding
The installation guide instructs users to run an auto-detected audit and explicitly states that internet access is required for API communication, but it does not clearly disclose what environment-derived data is collected and transmitted off-host. In a security-focused skill, this lack of transparency can lead users to expose agent identifiers, platform details, or other local context without informed consent.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The quick-start instructs users to make an installation script executable and run it directly, but provides no warning about what system changes the script will perform or how to inspect it first. In a security-focused skill, encouraging blind execution of a shell script is especially risky because users may grant broad trust to the package and execute potentially harmful or opaque actions.

Vague Triggers

Medium
Confidence
82% confidence
Finding
The trigger phrase "verify agent" is ambiguous and can overlap with common conversational requests about checking an agent's behavior or identity. Because this skill includes remote verification flows and local inspection features, accidental activation could lead to unnecessary scanning or external data transmission.

Vague Triggers

Medium
Confidence
82% confidence
Finding
The trigger phrase "verify agent" is ambiguous and can overlap with common conversational requests about checking an agent's behavior or identity. Because this skill includes remote verification flows and local inspection features, accidental activation could lead to unnecessary scanning or external data transmission.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The documentation instructs users to upload locally generated audit results to a remote API but does not warn that the results may contain sensitive agent configuration, prompt content, or discovered secrets. Because this file is specifically about security testing, users may reasonably assume uploads are safe and could unintentionally transmit confidential data off-host.

Vague Triggers

Medium
Confidence
89% confidence
Finding
The trigger phrase "security audit" is broad enough to match ordinary conversational requests, which can cause the skill to activate when the user did not explicitly intend to invoke this specific package. In this skill's context, unintended activation is more concerning because the skill advertises file access, local key generation, and outbound network submission, so accidental invocation could expose users to higher-friction security actions or data handling flows they did not mean to start.

Vague Triggers

Medium
Confidence
85% confidence
Finding
The phrase "verify agent" is ambiguous because it lacks the product name and could match many normal user intents about checking whether an agent is trustworthy or functioning correctly. That ambiguity increases the chance of unintended skill execution, and in this package the risk is amplified by trust/certificate workflows and potential outbound communication to a remote API.
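
As a rough illustration of a narrower trigger, the sketch below activates only when a message names the product and also contains an audit or verification word. How the host platform actually matches trigger phrases is not described in this report, so `should_activate` and both patterns are purely hypothetical.

```python
import re

# Hypothetical matcher: require the product name AND an action word.
_PRODUCT = re.compile(r"\bagentshield\b", re.IGNORECASE)
_ACTION = re.compile(r"\b(audit|verify|verification)\b", re.IGNORECASE)


def should_activate(message: str) -> bool:
    """Return True only for messages that explicitly name AgentShield."""
    return bool(_PRODUCT.search(message) and _ACTION.search(message))
```

With this rule, "Run an AgentShield audit on this agent" would activate the skill, while "Can you verify agent output?" would not.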

Missing User Warnings

Medium
Confidence
85% confidence
Finding
The script transmits the agent ID and a cryptographic signature over the network without explicit consent or even showing the destination. In a security tool, automatic transmission of identity and signed material increases risk because users may not realize they are authenticating to a remote party or exposing metadata to a configurable backend.
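
One way to address this is to show the destination and the exact payload, then require an explicit yes before sending. The helper below is a sketch under that assumption; `confirm_transmission` and its `ask` parameter are hypothetical names, not part of the skill.

```python
import json
from typing import Callable


def confirm_transmission(
    destination: str,
    payload: dict,
    ask: Callable[[str], str] = input,
) -> bool:
    """Show where the data is going and what it contains; default to 'no'."""
    print(f"About to send the following to {destination}:")
    print(json.dumps(payload, indent=2, sort_keys=True))
    answer = ask("Send this data? [y/N] ").strip().lower()
    return answer in {"y", "yes"}
```

A caller would run this check immediately before the network request and abort if it returns False. Passing `ask` as a parameter keeps the prompt testable without a terminal.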

Missing User Warnings

Medium
Confidence
84% confidence
Finding
The code copies potentially sensitive model output into `extracted_content`, which may contain leaked system prompts, secrets, or private conversation data. In a security-testing skill, retaining such content without minimization, redaction, consent, or clear disclosure increases the risk of secondary exposure through logs, reports, or downstream consumers.
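
A minimal sketch of the minimization and redaction this finding asks for, applied before model output is stored in a field such as `extracted_content`. The function name, both patterns, and the 200-character limit are assumptions chosen for illustration, and pattern-based redaction will miss secrets that do not match these shapes.

```python
import re

# Illustrative patterns: PEM private-key blocks and simple key=value secrets.
_PEM_KEY = re.compile(
    r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
    re.DOTALL,
)
_KEY_VALUE = re.compile(
    r"\b(api[_-]?key|token|secret|password)\b(\s*[:=]\s*)\S+",
    re.IGNORECASE,
)


def minimize_extracted_content(text: str, max_chars: int = 200) -> str:
    """Redact obvious secrets, then keep only a short excerpt as evidence."""
    text = _PEM_KEY.sub("[REDACTED PRIVATE KEY]", text)
    text = _KEY_VALUE.sub(r"\1\2[REDACTED]", text)
    if len(text) > max_chars:
        text = text[:max_chars] + " [truncated]"
    return text
```

Redacting before truncating ensures a secret that straddles the cut-off is not partially preserved in the excerpt.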

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The script generates and persists a private Ed25519 key under the user's home directory without an explicit warning at the point of creation. Although permissions are set to `0600`, silently creating long-lived private key material can surprise users and increases the blast radius if the host is later compromised or backups/logging expose the file.
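
A sketch of what an explicit warning at the point of creation could look like, combined with atomic `0600` creation that refuses to overwrite an existing key. The helper name `write_private_key` is hypothetical, and the key bytes are assumed to come from whatever key generation the skill already performs.

```python
import os
import sys
from pathlib import Path


def write_private_key(path: Path, key_bytes: bytes) -> None:
    """Warn the user, then create the key file with owner-only permissions."""
    print(
        f"WARNING: creating a long-lived private key at {path}. "
        "Back it up securely and keep it out of shared backups and logs.",
        file=sys.stderr,
    )
    path.parent.mkdir(parents=True, exist_ok=True)
    # O_EXCL fails if the file exists, so an existing key is never overwritten;
    # passing 0o600 at creation avoids a window where the file is readable by others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key_bytes)
```

Setting the mode in `os.open` rather than calling `chmod` afterwards means the key material never exists on disk with looser permissions.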

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The scanner defaults to recursively reading the user's workspace (`Path.home() / ".openclaw" / "workspace"`) and scans files without any explicit consent, disclosure, or scope confirmation. In an agent skill context, silent workspace inspection is privacy-sensitive and can expose secrets, source code, and unrelated local data to downstream reporting or logs even if the stated purpose is defensive.
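
A sketch of scope confirmation for a workspace scan: the root must be passed explicitly, the files in scope are counted and reported, and nothing is read until the user agrees. The names `collect_scan_targets` and `confirm_scan_scope` and the file limit are hypothetical.

```python
from pathlib import Path
from typing import Callable


def collect_scan_targets(root: Path, max_files: int = 1000) -> list[Path]:
    """List regular files under root without opening any of them."""
    targets = sorted(p for p in root.rglob("*") if p.is_file())
    if len(targets) > max_files:
        raise ValueError(f"{len(targets)} files exceeds the limit of {max_files}")
    return targets


def confirm_scan_scope(root: Path, ask: Callable[[str], str] = input) -> list[Path]:
    """Report the scope and return the targets only after explicit consent."""
    targets = collect_scan_targets(root)
    print(f"This scan will read {len(targets)} file(s) under {root}.")
    if ask("Proceed? [y/N] ").strip().lower() not in {"y", "yes"}:
        return []
    return targets
```

Requiring the caller to name the root, rather than defaulting to a directory under the user's home, makes the scanned scope a deliberate choice.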

Unpinned Dependencies

Low
Category
Supply Chain
Content
cryptography>=41.0.0
requests>=2.31.0
Confidence
90% confidence
Finding
cryptography>=41.0.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
cryptography>=41.0.0
requests>=2.31.0
Confidence
90% confidence
Finding
requests>=2.31.0
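
Both findings flag `>=` lower bounds, which let an install resolve to any newer release. A small sketch of a check that reports requirement lines lacking an exact `==` pin is shown below; `find_unpinned` is a hypothetical helper and handles only simple `name==version` lines, not the full requirements syntax.

```python
import re

# Matches "name==version" with an optional extras suffix such as name[extra].
_PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+\s*==\s*[^\s;]+")


def find_unpinned(requirements: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    unpinned = []
    for line in requirements.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Run against the two lines quoted in the findings, this reports both `cryptography>=41.0.0` and `requests>=2.31.0`.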

Known Vulnerable Dependency: cryptography (10 advisories): GHSA-39hc-v87j-747x (Vulnerable OpenSSL included in cryptography wheels); CVE-2023-50782 (Python Cryptography package vulnerable to Bleichenbacher timing oracle attack); GHSA-5cpq-8wj7-hf2v (Vulnerable OpenSSL included in cryptography wheels); and 7 more

High
Category
Supply Chain
Confidence
93% confidence
Finding
cryptography

Known Vulnerable Dependency: requests (10 advisories): CVE-2014-1830 (Exposure of Sensitive Information to an Unauthorized Actor in Requests); CVE-2024-47081 (Requests vulnerable to .netrc credentials leak via malicious URLs); CVE-2024-35195 (Requests `Session` object does not verify requests after making first request wi); and 7 more

High
Category
Supply Chain
Confidence
92% confidence
Finding
requests

VirusTotal

All 62 of 62 vendors rated this skill as clean.
