Openclaw Bastion

Prompt injection defense for agent workspaces. Scan files for injection attempts, analyze content boundaries, detect hidden instructions, and maintain command allowlists. Free alert layer — upgrade to openclaw-bastion-pro for active blocking, sanitization, and runtime enforcement.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
1 · 1.3k · 4 current installs · 4 all-time installs
MIT-0
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Suspicious
medium confidence
Purpose & Capability
Name/description claim a scanning/alerting tool for prompt-injection defense and the included Python script implements scanning and risk scoring — that aligns. However, both the README and SKILL.md emphasize that active remediation (blocking/sanitization/enforcement) is a paid "Pro" feature, while the shipped script exposes commands such as block, sanitize, quarantine, canary, enforce, and protect. That mismatch (marketing vs code capabilities) is noteworthy: the code includes active remediation capabilities that can modify or quarantine files even though the copy suggests the free version should be alert-only.
Instruction Scope
Runtime instructions tell the agent/user to run the included Python script to scan the entire workspace by default and to auto-detect workspace paths via OPENCLAW_WORKSPACE/current directory/~/.openclaw/workspace. Scanning the entire workspace and inspecting agent instruction files is coherent for a bastion tool, but it means the skill will read many possibly sensitive files. The SKILL.md itself contains injection-detection patterns (e.g. "ignore previous instructions"), which triggered the pre-scan detector — this is expected because the skill documents patterns to detect; still, it looks like a prompt-injection pattern embedded in the instructions and should be treated as a false positive for detection scanners.
Install Mechanism
No install spec (instruction-only), and the code uses only the Python standard library. No network downloads, package installs, or third-party registries are present in the manifest. That is low-installation risk.
Credentials
Declared requirements are minimal (python3). The script optionally consults OPENCLAW_WORKSPACE for workspace auto-detection, which is proportional to its function. No API keys, secrets, or unrelated environment variables are requested.
!
Persistence & Privilege
The script is not always-enabled and does not request platform-level privileges, but it creates workspace directories (.bastion, .quarantine), can write/rename/quarantine files, generate canary tokens, and can perform sanitization and enforcement actions. Those behaviors grant it the ability to modify or remove workspace files — a legitimate capability for a remediation tool, but a powerful one that users must consent to. Because the source/homepage are unknown, this increases the operational risk.
Scan Findings in Context
[ignore-previous-instructions] expected: The SKILL.md and README explicitly list injection patterns such as 'ignore previous instructions' as detection rules; the pre-scan matcher likely flagged this documented pattern. This is an expected false-positive in the context of an injection scanner, not proof of malicious intent in this skill's content.
What to consider before installing
This skill is generally coherent with its stated purpose (scanning for prompt-injection patterns) but includes commands that can modify, quarantine, or sanitize workspace files and generate canary tokens. Before installing or running it on a real workspace: 1) Review the bundled scripts (scripts/bastion.py) yourself or with a trusted developer to confirm behavior you accept. 2) Back up your workspace and test the tool on a copy first — exercise scan, check, and status only before running sanitize/quarantine/enforce/protect/canary. 3) Inspect .bastion-policy.json after creation to ensure its allowlist/blocklist fits your environment. 4) Because the package source and homepage are unknown, prefer running it in an isolated environment or container until you verify provenance. 5) Note the SKILL.md includes injection-pattern examples (which tripped the static scanner); that is expected for this kind of tool but keep it in mind when interpreting automated scans.

Like a lobster shell, security has layers — review code before you run it.

Current versionv1.0.2
Download zip
latestvk97c1c1b831kyqyjvabtcrtfq5811p5r

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

Runtime requirements

🏛️ Clawdis
OSmacOS · Linux · Windows
Binspython3

SKILL.md

OpenClaw Bastion

Runtime prompt injection defense for agent workspaces. While other tools watch workspace identity files, Bastion protects the input/output boundary — the files being read by the agent, web content, API responses, and user-supplied documents.

Why This Matters

Agents process content from many sources: local files, API responses, web pages, user uploads. Any of these can contain prompt injection attacks — hidden instructions that manipulate agent behavior. Bastion scans this content before the agent acts on it.

Commands

Scan for Injections

Scan files or directories for prompt injection patterns. Detects instruction overrides, system prompt markers, hidden Unicode, markdown exfiltration, HTML injection, shell injection, encoded payloads, delimiter confusion, multi-turn manipulation, and dangerous commands.

If no target is specified, scans the entire workspace.

python3 {baseDir}/scripts/bastion.py scan

Scan a specific file or directory:

python3 {baseDir}/scripts/bastion.py scan path/to/file.md
python3 {baseDir}/scripts/bastion.py scan path/to/directory/

Quick File Check

Fast single-file injection check. Same detection patterns as scan, targeted to one file.

python3 {baseDir}/scripts/bastion.py check path/to/file.md

Boundary Analysis

Analyze content boundary safety across the workspace. Identifies:

  • Agent instruction files that contain mixed trusted/untrusted content
  • Writable instruction files (attack surface for compromised skills)
  • Blast radius assessment for each critical file
python3 {baseDir}/scripts/bastion.py boundaries

Command Allowlist

Display the current command allowlist and blocklist policy. Creates a default .bastion-policy.json if none exists.

python3 {baseDir}/scripts/bastion.py allowlist
python3 {baseDir}/scripts/bastion.py allowlist --show

The policy file defines which commands are considered safe and which patterns are blocked. Edit the JSON file directly to customize. Bastion Pro enforces this policy at runtime via hooks.

Status

Quick summary of workspace injection defense posture: files scanned, findings by severity, boundary safety, and overall posture rating.

python3 {baseDir}/scripts/bastion.py status

Workspace Auto-Detection

If --workspace is omitted, the script tries:

  1. OPENCLAW_WORKSPACE environment variable
  2. Current directory (if AGENTS.md exists)
  3. ~/.openclaw/workspace (default)

What Gets Detected

CategoryPatternsSeverity
Instruction override"ignore previous", "disregard above", "you are now", "new system prompt", "forget your instructions", "override safety", "act as if no restrictions", "entering developer mode"CRITICAL
System prompt markers<system>, [SYSTEM], <<SYS>>, <|im_start|>system, [INST], ### System:CRITICAL
Hidden instructionsMulti-turn manipulation ("in your next response, you must"), stealth patterns ("do not tell the user")CRITICAL
HTML injection<script>, <iframe>, <img onerror=>, hidden divs, <svg onload=>CRITICAL
Markdown exfiltrationImage tags with encoded data in URLsCRITICAL
Dangerous commandscurl | bash, wget | sh, rm -rf /, fork bombsCRITICAL
Unicode tricksZero-width characters, RTL overrides, invisible formattingWARNING
Homoglyph substitutionCyrillic/Latin lookalikes mixed into ASCII textWARNING
Base64 payloadsLarge encoded blobs outside code blocksWARNING
Shell injection$(command) subshell execution outside code blocksWARNING
Delimiter confusionFake code block boundaries with injection contentWARNING

Context-Aware Scanning

  • Patterns inside fenced code blocks (```) are skipped to avoid false positives
  • Per-file risk scoring based on finding count and severity
  • Self-exclusion: Bastion skips its own skill files (which describe injection patterns)

Exit Codes

CodeMeaning
0Clean, no issues
1Warnings detected (review recommended)
2Critical findings (action needed)

No External Dependencies

Python standard library only. No pip install. No network calls. Everything runs locally.

Cross-Platform

Works with OpenClaw, Claude Code, Cursor, and any tool using the Agent Skills specification.

Files

3 total
Select a file
Select a file to preview.

Comments

Loading comments…