Agentshield Audit

Trust Infrastructure for AI Agents - Like SSL/TLS for agent-to-agent communication. 77 security tests, cryptographic certificates, and Trust Handshake Protoc...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

⭐ 0 · 714 · 3 current installs · 4 all-time installs

by @bartelmost

Security Scan

VirusTotal

Benign

OpenClaw

Suspicious

medium confidence

Purpose & Capability

The skill's name and description (AgentShield: audit + trust handshake) align with the included Python code (audit_client, initiate_audit, handshake, testers). However, the registry metadata claimed 'No install spec, instruction-only' while the bundle contains many code files plus a requirements.txt and installation instructions in clawhub.json. This mismatch is unexpected and reduces trust in the metadata. The code references a remote API (agentshield.live), which is coherent with the stated goal (submit results, obtain certificates).

Instruction Scope

SKILL.md and the documentation repeatedly state a 'privacy-first' human-in-the-loop flow and that private keys and prompts never leave the device. The code shows local tests and an API client that posts 'test_results' and public keys to a remote API, which is consistent. Two points remain open: (1) there are multiple references to auto-detection (python initiate_audit.py --auto), and it is not fully clear from the truncated code whether --auto always prompts for consent before reading IDENTITY.md/SOUL.md; (2) evidence fields exist in test results (agentshield_tester.py) and could include strings extracted from prompts/configs if the implementation passes them to submit_results; whether these evidence fields are stripped before upload is not fully verifiable from the truncated files. The code does include logic to run local tests and shows instances where system_prompt is passed (and scanned for injection patterns), so accidental transmission of sensitive prompts/configs is a plausible risk if the consent/strip logic diverges from the documentation.

Install Mechanism

There is no external download/install URL in the registry entry (the bundle claims to be fully bundled). clawhub.json lists 'pip install -r requirements.txt' as the install step (dependencies: cryptography, requests). That is an expected, low-to-moderate risk install approach. The initial registry summary incorrectly labeled the skill as instruction-only; that metadata mismatch is noteworthy but the actual bundle appears self-contained (no remote code fetching) which is good.

Credentials

The skill does not require secret environment variables to function; optional vars (AGENTSHIELD_API, AGENT_NAME, OPENCLAW_AGENT_NAME) are reasonable. The code explicitly avoids scanning for platform tokens (telegram/discord tokens) according to SKILL.md/manifest. However: (1) there are inconsistent filesystem paths used across scripts for where keys/certs are stored (initiate_audit.py writes to ~/.openclaw/workspace/.agentshield but complete_handshake.py loads from Path.home()/.agentshield) — this discrepancy may cause confusion or inadvertent creation of files in unexpected locations; (2) several modules (detect_openclaw_version) run local commands (openclaw --version) and read files (IDENTITY.md, SOUL.md) — appropriate for purpose if consented, but proportionate only if consent is reliably enforced; (3) evidence fields in tests could contain matched tokens or snippets; if those are uploaded, it would contradict privacy claims.

Persistence & Privilege

The skill does write persistent artifacts (private key, certificate, config) into the user's workspace which is expected for a certificate/audit tool. always:false (not force-installed) and disable-model-invocation:false (normal). There is no indication the skill tries to modify other skills or system-wide settings. The main concern is the earlier-noted path inconsistencies which could cause keys to be placed in different locations than documented.
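One way to act on the path inconsistency is to look at both reported locations after a run. The sketch below is a hypothetical helper, not part of the skill: it assumes only the two directories named in the findings above and simply lists whatever files exist in them.

```python
from pathlib import Path


def candidate_agentshield_dirs(home):
    """Return the two storage locations named in the scan findings."""
    return [
        home / ".openclaw" / "workspace" / ".agentshield",  # reported for initiate_audit.py
        home / ".agentshield",                              # reported for complete_handshake.py
    ]


def existing_agentshield_files(home):
    """Map each candidate directory that exists to the sorted file names inside it."""
    found = {}
    for directory in candidate_agentshield_dirs(home):
        if directory.is_dir():
            found[str(directory)] = sorted(
                p.name for p in directory.iterdir() if p.is_file()
            )
    return found


if __name__ == "__main__":
    # Print every location that currently holds AgentShield artifacts.
    for directory, names in existing_agentshield_files(Path.home()).items():
        print(directory, names)
```

If both directories show up with different contents, the key and certificate were likely written by different scripts, which is the situation the scan warns about.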

Scan Findings in Context

[prompt-injection-strings] expected: SKILL.md and several test payloads contain phrases like 'ignore previous instructions', 'system override' and zero-width/unicode control chars. These are intentional test vectors for prompt-injection detection and are expected in a security test suite.

[unicode-control-chars] expected: The echo/exploit payloads and input sanitizer explicitly look for zero-width and RTL override characters; their presence in test material is expected for detection purposes.

[hardcoded-dev-endpoint] expected: DEVELOPER_NOTE and CHANGELOG mention a historical Heroku/dev backend and the current default API domain, agentshield.live. Use of a development backend is plausible for this project, but you should treat a remote dev endpoint as higher risk for production use.

What to consider before installing

Before installing or running this skill:

1) Treat it as code that will generate and store a private key and write files into your workspace; back up and inspect those locations.
2) Inspect the full initiate_audit.py and submit path to confirm that 'auto' mode always prompts for explicit consent before reading IDENTITY.md, SOUL.md, or any system prompts; if you rely on privacy, prefer manual mode (--name/--platform) to avoid file reads.
3) Verify where the key and certificate are stored (there are inconsistent paths in scripts) and consider running in a disposable/sandboxed environment first.
4) If you plan to upload real agent prompts or secrets, confirm test evidence is never sent to the remote API (review AgentShieldClient.submit_results usage and contents of 'test_results').
5) Because the backend appears to be a development deployment, be cautious about sending any data you consider sensitive to agentshield.live; prefer local-only runs until you can verify the server's trustworthiness.

If you want, I can: (A) scan the remaining truncated files for any code that uploads raw prompt contents or files, or (B) list exact file paths the skill will read/write so you can decide where to run it.

✗ agentshield_tester.py:322: Dynamic code execution detected.

✗ tool_sandbox.py:395: Dynamic code execution detected.

TESTING.md:55: Prompt-injection style instruction pattern detected.

Patterns worth reviewing

These patterns may indicate risky behavior. Check the VirusTotal and OpenClaw results above for context-aware analysis before installing.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.22

Tags: agent-security, agents, ai-safety, api-security, audit, certificates, code-scan, compliance, cryptography, ed25519, eu-ai-act, human-in-the-loop, identity, latest, llm-security, privacy, privacy-first, prompt-injection, rate-limiting, secret-scanning, security, token-optimizer, trust, v6.0, verification

License

Terms: https://spdx.org/licenses/MIT-0.html

SKILL.md

AgentShield - Trust Infrastructure for AI Agents

The trust layer for the agent economy. Like SSL/TLS, but for AI agents.

🔐 Cryptographic Identity - Ed25519 signing keys
🤝 Trust Handshake Protocol - Mutual verification before communication
📋 Public Trust Registry - Reputation scores & track records
✅ 77 Security Tests - Comprehensive vulnerability assessment

🔒 Privacy Disclosure: See PRIVACY.md for detailed data handling information.

🎯 The Problem

Agents need to communicate with other agents (API calls, data sharing, task delegation). But how do you know if another agent is trustworthy?

Has it been compromised?
Is it leaking data?
Can you trust its responses?

Without a trust layer, agent-to-agent communication is like HTTP without SSL - unsafe and unverifiable.

💡 The Solution: Trust Infrastructure

AgentShield provides the trust layer for agent-to-agent communication:

1. Cryptographic Identity

Ed25519 key pairs - Industry-standard cryptography
Private keys stay local - Never transmitted
Public key certificates - Signed by AgentShield

2. Security Audit (77 Tests)

52 Live Attack Vectors:

Prompt injection (15 variants)
Encoding exploits (Base64, ROT13, Hex, Unicode)
Multi-language attacks (Chinese, Russian, Arabic, Japanese, German, Korean)
Social engineering (emotional appeals, authority pressure, flattery)
System prompt extraction attempts
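To make the encoding-exploit category concrete, here is a minimal sketch of how a detector might check a message in several decoded forms. It is an illustration under stated assumptions, not the skill's actual test code: the two phrases come from the scan findings quoted on this page, and the function names are hypothetical.

```python
import base64
import codecs
import re

# Two phrases quoted in the scan findings; a real suite would hold many more.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE),
    re.compile(r"system\s+override", re.IGNORECASE),
]


def decoded_views(text):
    """Return the text plus its ROT13, Base64 and hex decodings where valid."""
    views = [text, codecs.decode(text, "rot13")]
    compact = "".join(text.split())  # drop whitespace before strict decoding
    try:
        views.append(base64.b64decode(compact, validate=True).decode("utf-8"))
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        pass
    try:
        views.append(bytes.fromhex(compact).decode("utf-8"))
    except ValueError:
        pass
    return views


def looks_like_injection(text):
    """True if any decoded view of the text matches a known pattern."""
    return any(
        pattern.search(view)
        for view in decoded_views(text)
        for pattern in INJECTION_PATTERNS
    )
```

Checking every decoded view against the same pattern list keeps the pattern set in one place; the trade-off is that short benign strings can occasionally decode into noise, so a real detector would need more than two patterns and some tuning.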

25 Static Security Checks:

Input sanitization
Output DLP (data leak prevention)
Tool sandboxing
Secret scanning
Supply chain security
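As an illustration of the input-sanitization check, the scan findings mention that the sanitizer looks for zero-width and RTL override characters. A minimal sketch of such a check follows; the code-point set and function names are assumptions chosen for the example, not the skill's own list.

```python
import unicodedata

# Zero-width and bidirectional-control code points often used to hide text.
# This set is illustrative, not exhaustive.
SUSPICIOUS_CODEPOINTS = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}


def find_hidden_characters(text):
    """Return (index, Unicode name) for each suspicious character in text."""
    return [
        (index, unicodedata.name(char, "UNKNOWN"))
        for index, char in enumerate(text)
        if char in SUSPICIOUS_CODEPOINTS
    ]


def strip_hidden_characters(text):
    """Return text with all suspicious characters removed."""
    return "".join(char for char in text if char not in SUSPICIOUS_CODEPOINTS)
```

Reporting positions and names, rather than silently stripping, lets a test record why an input was flagged.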

Result: Security score (0-100) + Tier (VULNERABLE → HARDENED)

3. Trust Handshake Protocol

Agent A wants to communicate with Agent B:

# Step 1: Both agents get certified
python3 initiate_audit.py --auto

# Step 2: Agent A initiates handshake with Agent B
python3 handshake.py --target agent_B_id

# Step 3: Both agents sign challenges
# (Automatic in v1.0.13+)

# Step 4: Receive shared session key
# → Now you can communicate securely!

What you get:

✅ Mutual verification (both agents are who they claim to be)
✅ Shared session key (for encrypted communication)
✅ Trust score boost (+5 for successful handshakes)
✅ Public track record (handshake history)

4. Public Trust Registry

Searchable database of all certified agents
Reputation scores based on audits, handshakes, and time
Trust tiers: UNVERIFIED → BASIC → VERIFIED → TRUSTED
Revocation list (CRL) - Compromised agents get flagged

🚀 Quick Start

Install

clawhub install agentshield
cd ~/.openclaw/workspace/skills/agentshield*/

Get Certified (77 Security Tests)

# Auto-detect agent name from IDENTITY.md/SOUL.md
python3 initiate_audit.py --auto

# Or manual:
python3 initiate_audit.py --name "MyAgent" --platform telegram

Output:

✅ Agent ID: agent_xxxxx
✅ Security Score: XX/100
✅ Tier: PATTERNS_CLEAN / HARDENED / etc.
✅ Certificate (90-day validity)

Verify Another Agent

python3 verify_peer.py agent_yyyyy

Trust Handshake with Another Agent

# Initiate handshake
python3 handshake.py --target agent_yyyyy

# Result: Shared session key for encrypted communication

📋 Use Cases

1. Agent-to-Agent API Calls

Before: Agent A calls Agent B's API - no way to verify B's integrity
With AgentShield: Agent A checks Agent B's certificate + handshake → Verified communication

2. Multi-Agent Task Delegation

Before: Orchestrator spawns sub-agents - can't verify they're safe
With AgentShield: All sub-agents certified → Orchestrator knows they're trusted

3. Agent Marketplaces

Before: Download random agents from the internet - no trust guarantees
With AgentShield: Browse Trust Registry → Only hire VERIFIED agents

4. Data Sharing Between Agents

Before: Share sensitive data with another agent - hope it doesn't leak
With AgentShield: Handshake → Encrypted session key → Secure data transfer

🛡️ Security Architecture

Privacy-First Design

✅ All 77 tests run locally - Your system prompts NEVER leave your device
✅ Private keys stay local - Only public keys transmitted
✅ Human-in-the-Loop - Explicit consent before reading IDENTITY.md/SOUL.md
✅ No environment scanning - Doesn't scan for API tokens

What goes to the server:

Public key (Ed25519)
Agent name & platform
Test scores (passed/failed summary)

What stays local:

Private key
System prompts
Configuration files
Detailed test results

Environment Variables (Optional)

AGENTSHIELD_API=https://agentshield.live  # API endpoint
AGENT_NAME=MyAgent                        # Override auto-detection
OPENCLAW_AGENT_NAME=MyAgent               # OpenClaw standard
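A sketch of how a client could resolve these optional variables is shown below. The default endpoint is taken from the example value above; the precedence of AGENT_NAME over OPENCLAW_AGENT_NAME and the function name are assumptions, since the listing does not state how the skill orders them.

```python
import os

# Assumed default, taken from the example value in this section.
DEFAULT_API = "https://agentshield.live"


def resolve_settings(env=None):
    """Resolve the API endpoint and an optional agent-name override."""
    env = os.environ if env is None else env
    api = env.get("AGENTSHIELD_API", DEFAULT_API).rstrip("/")
    # Assumed precedence: AGENT_NAME first, then OPENCLAW_AGENT_NAME.
    name = env.get("AGENT_NAME") or env.get("OPENCLAW_AGENT_NAME")
    return {"api": api, "agent_name": name}
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.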

📊 What You Get

Certificate (90-day validity)

{
  "agent_id": "agent_xxxxx",
  "public_key": "...",
  "security_score": 85,
  "tier": "PATTERNS_CLEAN",
  "issued_at": "2026-03-10",
  "expires_at": "2026-06-08"
}
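The date fields in the certificate can be checked locally. The following sketch parses the example above and confirms the 90-day window; it only inspects dates and does not verify any signature, and treating the expiry date as inclusive is an assumption.

```python
import json
from datetime import date

CERT_JSON = """
{
  "agent_id": "agent_xxxxx",
  "public_key": "...",
  "security_score": 85,
  "tier": "PATTERNS_CLEAN",
  "issued_at": "2026-03-10",
  "expires_at": "2026-06-08"
}
"""


def validity_days(cert):
    """Number of days between issued_at and expires_at."""
    issued = date.fromisoformat(cert["issued_at"])
    expires = date.fromisoformat(cert["expires_at"])
    return (expires - issued).days


def is_current(cert, today):
    """True while today falls within [issued_at, expires_at] (inclusive, by assumption)."""
    issued = date.fromisoformat(cert["issued_at"])
    expires = date.fromisoformat(cert["expires_at"])
    return issued <= today <= expires


cert = json.loads(CERT_JSON)
print(validity_days(cert))  # 90
```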

Trust Registry Entry

✅ Public verification URL: agentshield.live/verify/agent_xxxxx
✅ Trust score (0-100) based on:
- Age (longer = more trust)
- Verification count
- Handshake success rate
- Days active
✅ Tier: UNVERIFIED → BASIC → VERIFIED → TRUSTED

Handshake Proof

{
  "handshake_id": "hs_xxxxx",
  "requester": "agent_A",
  "target": "agent_B",
  "status": "completed",
  "session_key": "...",
  "completed_at": "2026-03-10T20:00:00Z"
}

🔧 Scripts Included

Script	Purpose
`initiate_audit.py`	Run 77 security tests & get certified
`handshake.py`	Trust handshake with another agent
`verify_peer.py`	Check another agent's certificate
`show_certificate.py`	Display your certificate
`agentshield_tester.py`	Standalone test suite (advanced)

🌐 Trust Handshake Protocol (Technical)

Flow

Initiate: Agent A → Server: "I want to handshake with Agent B"
Challenge: Server generates random challenges for both agents
Sign: Both agents sign their challenges with private keys
Verify: Server verifies signatures with public keys
Complete: Server generates shared session key
Trust Boost: Both agents +5 trust score

Cryptography

Algorithm: Ed25519 (EdDSA over the edwards25519 curve)
Key Size: 256-bit (32-byte public and private keys)
Signature: Deterministic (same key and message = same signature)
Session Key: AES-256 compatible
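The challenge, sign and verify steps can be sketched with the `cryptography` package, which the scan lists among the skill's dependencies. This is an illustration of the general technique, not the skill's protocol code: the 32-byte challenge size and the `verify_challenge` helper are assumptions, and session-key generation is left out because the listing does not describe it.

```python
import secrets

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

# Agent side: the private key is generated and kept locally.
private_key = Ed25519PrivateKey.generate()
public_bytes = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.Raw,
    format=serialization.PublicFormat.Raw,
)

# Server side (hypothetical): issue a random challenge.
challenge = secrets.token_bytes(32)

# Agent side: sign the challenge with the private key.
signature = private_key.sign(challenge)


def verify_challenge(public_key_bytes, message, sig):
    """Server side: True if sig is a valid Ed25519 signature over message."""
    try:
        Ed25519PublicKey.from_public_bytes(public_key_bytes).verify(sig, message)
        return True
    except InvalidSignature:
        return False


print(len(public_bytes), len(signature))  # 32 64
```

Only `public_bytes` and `signature` would need to cross the network in a scheme like this; the private key never leaves the signing side.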

🚀 Roadmap

Current (v1.0.13):

✅ 77 security tests
✅ Ed25519 certificates
✅ Trust Handshake Protocol
✅ Public Trust Registry
✅ CRL (Certificate Revocation List)

Coming Soon:

⏳ Auto re-audit (when prompts change)
⏳ Negative event reporting
⏳ Fleet management (multi-agent dashboard)
⏳ Trust badges for messaging platforms

📖 Learn More

Website: https://agentshield.live
GitHub: https://github.com/bartelmost/agentshield
API Docs: https://agentshield.live/docs
ClawHub: https://clawhub.ai/bartelmost/agentshield

🎯 TL;DR

AgentShield is SSL/TLS for AI agents.

Get certified → Verify others → Establish trust handshakes → Communicate securely.

# 1. Get certified
python3 initiate_audit.py --auto

# 2. Handshake with another agent
python3 handshake.py --target agent_xxxxx

# 3. Verify others
python3 verify_peer.py agent_yyyyy

Building the trust layer for the agent economy. 🛡️

🔒 Data Transmission Transparency

What Gets Sent to AgentShield API

During Audit Submission:

{
  "agent_name": "YourAgent",
  "platform": "telegram",
  "public_key": "base64_encoded_ed25519_public_key",
  "test_results": {
    "score": 85,
    "tests_passed": 74,
    "tests_total": 77,
    "tier": "PATTERNS_CLEAN",
    "failed_tests": ["test_name_1", "test_name_2"]
  }
}
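A sketch of how detailed local results could be reduced to the summary shape above before upload follows. The per-test record layout (`name`, `passed`, `evidence`) and both function names are hypothetical; the point is that evidence strings are never copied into the payload and that the summary is checked against an allow-list.

```python
import json

ALLOWED_SUMMARY_KEYS = ("score", "tests_passed", "tests_total", "tier", "failed_tests")


def build_submission(agent_name, platform, public_key_b64, detailed_results, score, tier):
    """Reduce detailed local results to the summary shape; evidence is dropped."""
    failed = [r["name"] for r in detailed_results if not r["passed"]]
    summary = {
        "score": score,
        "tests_passed": len(detailed_results) - len(failed),
        "tests_total": len(detailed_results),
        "tier": tier,
        "failed_tests": failed,
    }
    return {
        "agent_name": agent_name,
        "platform": platform,
        "public_key": public_key_b64,
        "test_results": summary,
    }


def check_summary_only(payload):
    """Raise ValueError if test_results carries any key outside the allow-list."""
    extra = set(payload["test_results"]) - set(ALLOWED_SUMMARY_KEYS)
    if extra:
        raise ValueError(f"unexpected keys in test_results: {sorted(extra)}")


if __name__ == "__main__":
    results = [{"name": "example_test", "passed": False, "evidence": "local only"}]
    payload = build_submission("YourAgent", "telegram", "...", results, 0, "VULNERABLE")
    check_summary_only(payload)
    print(json.dumps(payload, indent=2))
```

Running an allow-list check immediately before the HTTP call is one way to address the scan's concern that evidence fields might otherwise reach the API.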

What is NOT sent:

❌ Full test output/logs
❌ Your prompts or system messages
❌ IDENTITY.md or SOUL.md file contents
❌ Private keys (stay in ~/.agentshield/agent.key)
❌ Workspace files or memory

API Endpoint:

Primary: https://agentshield.live/api (proxies to Heroku backend)
All traffic over HTTPS (TLS 1.2+)

🛡️ Consent & Privacy

File Read Consent:

Skill requests permission BEFORE reading IDENTITY.md/SOUL.md
User sees: "Read IDENTITY.md for agent name? [Y/n]"
If declined: Manual mode (--name flag)
If approved: Only name/platform extracted (not full file content)

Privacy-First Mode:

export AGENTSHIELD_NO_AUTO_DETECT=1
python3 initiate_audit.py --name "MyBot" --platform "telegram"

→ Zero file reads, manual input only
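The consent rules above can be expressed as a small gate in front of any file read. This is a sketch of the described behavior, not the skill's code: `resolve_agent_name`, `ManualInputRequired`, and the injected `ask` and `read_name_from_identity` callables are hypothetical names introduced for the example.

```python
class ManualInputRequired(Exception):
    """Raised when the name must be supplied with --name instead of a file read."""


def resolve_agent_name(cli_name, env, ask, read_name_from_identity):
    """Return an agent name, reading IDENTITY.md only after explicit consent."""
    if cli_name:
        # Manual mode: no file is touched.
        return cli_name
    if env.get("AGENTSHIELD_NO_AUTO_DETECT") == "1":
        raise ManualInputRequired("auto-detect disabled, pass --name")
    answer = ask("Read IDENTITY.md for agent name? [Y/n] ").strip().lower()
    if answer in ("", "y", "yes"):
        # The reader is called only on this branch, after consent.
        return read_name_from_identity()
    raise ManualInputRequired("consent declined, pass --name")
```

In a command-line tool this could be called as `resolve_agent_name(args.name, os.environ, input, reader)`; passing `ask` and the reader as parameters makes it straightforward to verify that no read happens without consent.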

See PRIVACY.md for complete data handling documentation.

Files

29 total
