Vext Shield

v1.2.0

AI-native security suite for OpenClaw. Scans skills for prompt injection, data exfiltration, cognitive rootkits, semantic worms, and more. Includes static an...

by Vext Labs, Inc. (@vext-labs)

Security Scan

VirusTotal: Benign

OpenClaw: Benign (high confidence)

✓

Purpose & Capability

Name/description match the provided artifacts: the package contains a multi-component scanner, red-team tests, monitor, firewall and dashboard. Required binaries (python3) and included files (scanner, sandbox, threat signatures, test fixtures) are proportional to a local security suite. Files that contain malicious-looking payloads (webhook.site, reverse shell strings, 'Ignore all previous instructions', etc.) are present but documented in allowlist.json as intentional signatures/test fixtures.

ℹ

Instruction Scope

SKILL.md and code instruct the agent to run local Python scripts and to save reports under ~/.openclaw/vext-shield/reports/. The runtime instructions and sandbox behavior explicitly state they will copy target skills to a temp dir, strip sensitive env vars, and refuse to execute if OS-level kernel sandboxing is unavailable. The SKILL.md includes many example payloads and threat strings (prompt-injection phrases) which triggered pre-scan detectors — these are documented examples used by the scanner and red-team, not instructions to exfiltrate data. Reviewers should confirm the sandbox tools (sandbox-exec on macOS, unshare on Linux) are available on their host before using adversarial tests.

✓

Install Mechanism

No install spec is provided in the registry entry (instruction-only), but the package includes full Python source and a documented manual install (git clone or ClawHub). No external downloads or obscure URLs are used; the code claims zero external dependencies beyond Python stdlib. This is proportionate for an on-host analysis tool. There is no remote fetch/install of third-party packages in the provided artifacts.

✓

Credentials

The skill requests no environment variables or credentials and the sandbox code explicitly strips many sensitive env var names and prefixes. The suite writes reports and logs to ~/.openclaw/vext-shield/, which is expected for a local security tool. No unrelated credentials are requested.

ℹ

Persistence & Privilege

The skill does not demand 'always: true' or elevated persistent privileges. It will write reports, baselines, firewall policy files and logs under ~/.openclaw/vext-shield/, which is consistent with its function. SKILL.md claims target skills are never modified and sandbox executes against temp copies; the code implements copying and snapshot diffing. If you enable runtime monitoring or firewall policy changes, expect persistent files under the stated data directory.

Scan Findings in Context

[ignore-previous-instructions] expected: SKILL.md and documentation intentionally include prompt-injection example strings (e.g., 'Ignore all previous instructions', 'You are now DAN') because the scanner detects such patterns and the red-team uses them as payloads. The pre-scan detector flagged these examples; this is expected for a security test suite.

[system-prompt-override] expected: SKILL.md contains examples and descriptions of system-prompt override techniques as part of threat documentation and red-team batteries. Presence of these strings in docs and test fixtures is justified by the tool's purpose.

Assessment

This package is a self-contained on-host security suite that includes a static signature database and adversarial test payloads. The many 'malicious' strings and test scripts are intentional: they are the signatures the scanner matches and the payloads used to validate its detections.

Before installing:

1) Verify that you trust the publisher (Vext Labs), or inspect the source yourself.
2) Ensure your host provides the required OS-level sandboxing tools (macOS: sandbox-exec; Linux: unshare), because the red-team and sandboxed behavioral tests refuse to run without them.
3) Expect local files to be created under ~/.openclaw/vext-shield/ (reports, logs, firewall-policy, baselines).
4) Review shared/threat_signatures.json and skills/vext-redteam/redteam.py if you want to confirm which payloads are included.
5) If you lack kernel sandboxing or are uncomfortable with adversarial test payloads on your machine, avoid running the red-team behavioral tests and restrict usage to the static scan/audit functions.

Finally, although the code claims 'zero network requests', you should still audit the code paths that parse decoded payloads and any code that processes user-provided input, to ensure no accidental outbound network actions occur in your environment.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🛡️ Clawdis

Bins: python3

309 downloads

1 star

6 versions

Updated 1mo ago

MIT-0

VEXT Shield

AI-native security for the agentic era. Detects threats that VirusTotal and traditional scanners cannot: prompt injection, semantic worms, cognitive rootkits, data exfiltration, permission boundary violations, and behavioral attacks.

Skills Included

This suite includes 6 security skills:

vext-scan — Static Analysis Scanner

Scans all installed skills for 227+ threat patterns using regex matching, Python AST analysis, and encoded content detection (base64, ROT13, unicode homoglyphs).
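
The three techniques named above (regex matching, Python AST analysis, and encoded content detection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the signature names, the two patterns, the base64 heuristic, and the function names `scan_text` and `risky_calls` are hypothetical and are not taken from VEXT Shield's signature database or code.

```python
import ast
import base64
import re

# Hypothetical signatures for illustration only (not VEXT Shield's database).
PATTERNS = {
    "ignore-previous-instructions": re.compile(
        r"ignore\s+all\s+previous\s+instructions", re.IGNORECASE
    ),
    "webhook-exfil": re.compile(r"webhook\.site", re.IGNORECASE),
}

# Heuristic: a long run of base64 alphabet characters may hide a payload.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")

# Builtins whose direct use in a skill deserves a closer look.
RISKY_CALLS = {"eval", "exec"}


def scan_text(text):
    """Return sorted signature names found in text or in decoded base64 runs."""
    candidates = [text]
    for run in BASE64_RUN.findall(text):
        try:
            decoded = base64.b64decode(run, validate=True).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue  # not valid base64 or not text; skip this run
        candidates.append(decoded)
    return sorted(
        {name for name, pattern in PATTERNS.items()
         for candidate in candidates if pattern.search(candidate)}
    )


def risky_calls(source):
    """Return sorted names of risky builtins called directly in Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in RISKY_CALLS):
            found.add(node.func.id)
    return sorted(found)
```

Parsing with `ast` rather than grepping for `eval(` avoids matches inside comments and string literals, which is one plausible reason a scanner would combine both approaches.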

"Scan my skills"
"Scan the weather-lookup skill"

vext-audit — Installation Audit

Audits your OpenClaw installation for security misconfigurations: sandbox settings, API key storage, file permissions, network exposure, and SOUL.md integrity.

"Audit my openclaw"

vext-redteam — Adversarial Testing

Runs 6 adversarial test batteries against any skill: prompt injection (24 payloads), data boundary, persistence, exfiltration, escalation, and worm behavior.

"Red team the weather-lookup skill"
"Red team my custom skill at /path/to/skill"

vext-monitor — Runtime Monitor

Watches for suspicious activity: file integrity changes, sensitive file access, outbound network connections, and suspicious processes.
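
File integrity checking of this kind is commonly built on a hash baseline. The sketch below shows that general idea, assuming SHA-256 digests over every file in a directory; `snapshot` and `diff_snapshots` are hypothetical names, and the actual monitor may work differently.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(before, after):
    """Classify paths as added, removed, or modified between two snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            path for path in before.keys() & after.keys()
            if before[path] != after[path]
        ),
    }
```

A baseline taken once and compared on each later run is enough to report which files changed; it does not say who changed them or why.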

"Monitor my skills"

vext-firewall — Policy Firewall

Defines per-skill network and file access policies with default-deny allowlists.
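
A default-deny allowlist means that anything not explicitly permitted is rejected, including skills that have no policy at all. The following is a minimal sketch of that rule for network hosts only; the in-memory `policy` dictionary and the functions `allow_host` and `is_allowed` are assumptions for illustration and do not reflect the firewall's real policy file format.

```python
from urllib.parse import urlsplit


def allow_host(policy, skill, host):
    """Add a host to a skill's allowlist, creating the entry if needed."""
    policy.setdefault(skill, set()).add(host.lower())


def is_allowed(policy, skill, url):
    """Default deny: permit a URL only if its host is allowlisted for the skill."""
    host = (urlsplit(url).hostname or "").lower()
    return host in policy.get(skill, set())
```

With `allow_host(policy, "weather-lookup", "api.open-meteo.com")` applied to an empty policy, a request from weather-lookup to that host passes, while any other host, or any request from a skill without an entry, is denied.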

"Allow weather-lookup to access api.open-meteo.com"
"Show firewall rules"

vext-dashboard — Security Dashboard

Aggregates data from all VEXT Shield components into a single security posture report.

"Security dashboard"

Running Individual Skills

python3 skills/vext-scan/scan.py --all
python3 skills/vext-audit/audit.py
python3 skills/vext-redteam/redteam.py --skill-dir /path/to/skill
python3 skills/vext-monitor/monitor.py
python3 skills/vext-firewall/firewall.py list
python3 skills/vext-dashboard/dashboard.py

Rules

Target skill files are never modified — sandbox executes against a temporary copy
Report all findings honestly without minimizing severity
VEXT Shield itself makes zero network requests
Save all reports locally to ~/.openclaw/vext-shield/reports/
Treat every skill as potentially hostile during scanning

Safety & Sandbox Isolation

VEXT Shield requires OS-level sandbox isolation to execute untrusted code. If kernel-level sandboxing is not available, execution is refused — there is no unsafe fallback.

Sandbox enforcement:

Platform	Network	Filesystem	Method
macOS	Blocked at kernel	Write-restricted to temp only	`sandbox-exec` deny-network profile
Linux	Blocked at kernel	Write-restricted to temp only	`unshare --net` network namespace
Other	Execution refused	Execution refused	No fallback — will not run untrusted code
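
The platform table can be read as a small decision function: pick the kernel-level wrapper for the host, or refuse. The sketch below assumes that reading. The function name `sandbox_prefix` is hypothetical, and the profile string is an assumed example of a deny-network profile; the source names only `sandbox-exec` and `unshare --net`, not the exact arguments used. The sketch covers the network column only, not the filesystem write restriction.

```python
import shutil
import sys

# Assumed profile text; the source says only that a deny-network profile is used.
DENY_NETWORK_PROFILE = "(version 1)(allow default)(deny network*)"


def sandbox_prefix(platform=sys.platform, which=shutil.which):
    """Return the argv prefix that wraps an untrusted command, or raise."""
    if platform == "darwin" and which("sandbox-exec"):
        return ["sandbox-exec", "-p", DENY_NETWORK_PROFILE]
    if platform.startswith("linux") and which("unshare"):
        return ["unshare", "--net"]
    # No fallback: without an OS-level sandbox, nothing is executed.
    raise RuntimeError(
        f"no OS-level sandbox available on {platform!r}; refusing to execute"
    )
```

On Linux, creating a network namespace usually requires elevated privileges or an accompanying user namespace, so finding `unshare` on the PATH does not by itself guarantee that the wrapped command will start.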

All executions include:

Target executed in a temporary copy (original skill directory is never modified)
HOME overridden to temp directory (prevents writes to ~/.openclaw, ~/.ssh, etc.)
Sensitive env vars stripped (API keys, tokens, AWS/SSH/GitHub credentials)
PATH restricted to system directories only
30-second timeout with process kill
Post-execution file snapshot diffing to detect any changes
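
The environment-related items in this list (HOME override, stripped credentials, restricted PATH) can be sketched as a single function that builds the child process environment. The marker and prefix lists, the PATH value, and the name `build_env` are assumptions for illustration; the real list of stripped names lives in the sandbox code and is not reproduced here. This sketch provides no isolation on its own and is not a substitute for the kernel-level sandbox.

```python
# Hypothetical filters; the actual names and prefixes stripped may differ.
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")
SENSITIVE_PREFIXES = ("AWS_", "SSH_", "GITHUB_")

# Assumed meaning of "system directories only".
SYSTEM_PATH = "/usr/bin:/bin"


def build_env(base_env, temp_home):
    """Return a copy of base_env that is safer to hand to untrusted code."""
    env = {}
    for name, value in base_env.items():
        upper = name.upper()
        if upper.startswith(SENSITIVE_PREFIXES):
            continue  # credential families such as AWS_*, SSH_*, GITHUB_*
        if any(marker in upper for marker in SENSITIVE_MARKERS):
            continue  # anything that looks like a key, token, or secret
        env[name] = value
    env["HOME"] = temp_home  # writes to ~ land in the temp directory
    env["PATH"] = SYSTEM_PATH
    return env
```

The result could be passed as `env=` to `subprocess.run` together with `timeout=30`, which kills the child process when the timeout expires.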

No bypass options exist. There is no --skip-sandbox flag, no --no-sandbox flag, no require_full_isolation parameter, and no weaker fallback mode in the codebase. The SandboxRunner class accepts only timeout_seconds — isolation is unconditional. If OS-level sandboxing is unavailable, execution raises an error. Sandbox behavioral tests always run with OS-level enforcement.