AgentWard Sanitize

v1.0.0

Detect and redact PII from text files. Supports 15 categories including credit cards, SSNs, emails, API keys, addresses, and more — with zero dependencies.

Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (PII detection & redaction) match the delivered artifacts: a Python3 script implementing regex/Luhn-based detectors and a docs file listing supported categories. Requiring only python3 is proportional.
Instruction Scope
SKILL.md gives clear runtime rules to avoid reading raw input or the entity-map and to prefer --output/preview/json. That is appropriate for minimizing exposure of raw PII, but it relies on the agent actually following those rules. The docs also mention a plain 'sanitize to stdout' usage and state 'PII never reaches stdout' — this is likely because sanitized output replaces values with placeholders, but the mixed wording could confuse non-technical users. Important: the script writes a sidecar entity-map containing raw PII when --output is used; the README explicitly instructs not to read that file.
Install Mechanism
No install spec and only a single Python script included; no downloads or external packages. Lowest-risk install posture.
Credentials
No environment variables, credentials, or external service tokens are requested. The API-key detection patterns include common provider prefixes (OpenAI, GitHub, AWS, Slack) for detection only; they are not used to access those services.
Persistence & Privilege
always is false and the skill does not request elevated or persistent privileges. It does write output and an entity-map file to disk (when --output is used), which is normal for a redaction tool.
Assessment
This appears to be a straightforward, local PII redaction tool. Before installing or running it:
  1) Review and keep the script local; it uses only Python stdlib and makes no network calls according to the code and docs.
  2) Be aware the tool writes an entity-map sidecar containing raw PII when you use --output; protect or securely delete that file (do not let the agent or other processes read it).
  3) Ensure your agent runtime actually follows the SKILL.md rule not to open the raw input or the entity-map; the safety depends on that behavior.
  4) Prefer --preview or --json for inspection and run the tool on a sandbox or test data first.
  5) If you need absolute assurance, open and scan the full sanitize.py file yourself (the included code appears benign).


Runtime requirements

🛡 Clawdis
Bins: python3
latest vk978tpfw683x0jf9cgawqx3ym981xjvz
Downloads: 313
Stars: 1
Versions: 1
Updated 1mo ago
Version: v1.0.0
License: MIT-0

AgentWard Sanitize

Detect and redact personally identifiable information (PII) from text files.

IMPORTANT — PII Safety Rules

  • Do NOT read the input file directly. It may contain sensitive PII.
  • ALWAYS use --output FILE to write sanitized output to a file.
  • Only read the OUTPUT file, never the raw input.
  • Only show the user the redacted output, never the raw input.
  • --json and --preview are safe — they do NOT print raw PII values to stdout.
  • The entity map (raw PII → placeholder mapping) is written to a separate sidecar file (*.entity-map.json) only when --output is used. Do NOT read the entity map file.
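The rules above could be enforced by a small agent-side wrapper. The following is a minimal sketch, not part of the skill: the helper names `build_sanitize_command` and `sanitize_file` are hypothetical, the script path is assumed to be scripts/sanitize.py, and the interpreter name is assumed to be python3. Only the --output and --categories flags come from this document.

```python
import subprocess
from pathlib import Path

# Assumed location of the script; adjust to your checkout.
SCRIPT = "scripts/sanitize.py"


def build_sanitize_command(input_path, output_path, categories=None):
    """Build the argv list for a run that always writes to --output."""
    cmd = ["python3", SCRIPT, str(input_path), "--output", str(output_path)]
    if categories:
        cmd += ["--categories", ",".join(categories)]
    return cmd


def sanitize_file(input_path, output_path, categories=None):
    """Run the sanitizer and return only the redacted text.

    The raw input file and the *.entity-map.json sidecar are never
    opened by this wrapper.
    """
    cmd = build_sanitize_command(input_path, output_path, categories)
    # Discard stdout; a non-zero exit status raises CalledProcessError.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return Path(output_path).read_text(encoding="utf-8")
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with unusual file names.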

What it does

Scans files for PII (credit cards, CVVs, card expiry dates, SSNs, emails, phone numbers, API keys, IP addresses, mailing addresses, dates of birth, passport numbers, driver's license numbers, bank routing numbers, medical license numbers, and insurance member IDs) and replaces each instance with a numbered placeholder like [CREDIT_CARD_1].
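To illustrate the numbered-placeholder idea, here is a minimal sketch. It is not the code from sanitize.py: the two regexes are simplified stand-ins, the `redact` function is hypothetical, and it assumes that a value seen twice reuses the same placeholder, which this document does not confirm.

```python
import re

# Illustrative patterns only; the real script's regexes are not shown here.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a numbered placeholder such as [EMAIL_1].

    Returns the redacted text and a raw-value -> placeholder map.
    """
    entity_map = {}
    counters = {}

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            raw = match.group(0)
            # Assumption: repeated values share one placeholder.
            if raw not in entity_map:
                counters[label] = counters.get(label, 0) + 1
                entity_map[raw] = f"[{label}_{counters[label]}]"
            return entity_map[raw]

        text = pattern.sub(substitute, text)
    return text, entity_map
```

The returned map plays the role of the entity map described above, which is why it holds raw PII and should be kept away from anything that only needs the redacted text.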

Usage

Sanitize a file (RECOMMENDED — always use --output)

python scripts/sanitize.py patient-notes.txt --output clean.txt

Preview mode (detect PII categories/offsets without showing raw values)

python scripts/sanitize.py notes.md --preview

JSON output (safe — no raw PII in stdout)

python scripts/sanitize.py report.txt --json --output clean.txt

Filter to specific categories

python scripts/sanitize.py log.txt --categories ssn,credit_card,email --output clean.txt

Supported PII categories

See references/SUPPORTED_PII.md for the full list with detection methods and false positive mitigation.

Category        | Pattern type                      | Example
credit_card     | Luhn-validated 13-19 digits       | 4111 1111 1111 1111
ssn             | 3-2-4 digit groups                | 123-45-6789
cvv             | Keyword-anchored 3-4 digits       | CVV: 123
expiry_date     | Keyword-anchored MM/YY            | expiry 01/30
api_key         | Provider prefix patterns          | sk-abc..., ghp_..., AKIA...
email           | Standard email format             | user@example.com
phone           | US/intl phone numbers             | +1 (555) 123-4567
ip_address      | IPv4 addresses                    | 192.168.1.100
date_of_birth   | Keyword-anchored dates            | DOB: 03/15/1985
passport        | Keyword-anchored alphanumeric     | Passport: AB1234567
drivers_license | Keyword-anchored alphanumeric     | DL: D12345678
bank_routing    | Keyword-anchored 9 digits         | routing: 021000021
address         | Street + city/state/zip           | 742 Evergreen Terrace Dr, Springfield, IL 62704
medical_license | Keyword-anchored license ID       | License: CA-MD-8827341
insurance_id    | Keyword-anchored member/policy ID | Member ID: BCB-2847193
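Two of the pattern types in the table, Luhn validation and keyword anchoring, can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: `luhn_valid` and `CVV_PATTERN` are hypothetical names, and the CVV regex is a simplified guess at what "keyword-anchored" means here.

```python
import re


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# Keyword anchoring: 3-4 digits only count as a CVV when the keyword precedes them.
CVV_PATTERN = re.compile(r"\bCVV\s*[:=]?\s*(\d{3,4})\b", re.IGNORECASE)
```

The checksum rejects most random 13-19 digit runs, and the keyword requirement keeps short numbers such as order quantities from being flagged. Both are common ways to reduce false positives.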

Security and Privacy

  • All processing is local. The script makes zero network calls. No data leaves your machine.
  • Zero dependencies. Uses only Python standard library — no third-party packages to audit.
  • Raw PII never reaches stdout. The --json and --preview modes strip raw PII values from their output. The entity map (containing raw PII to placeholder mappings) is only written to a sidecar file on disk when --output is used.
  • Designed for agent safety. The skill instructions above tell the agent to never read the raw input file or the entity map file — only the sanitized output.
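Because the entity map holds raw PII, you may want to delete it once the sanitized output has been produced. The sketch below is one way to do that, assuming only the *.entity-map.json naming given above; the `remove_entity_maps` helper is hypothetical and not part of the skill.

```python
from pathlib import Path


def remove_entity_maps(directory):
    """Delete every *.entity-map.json sidecar in `directory` without reading it.

    Returns the names of the files that were removed.
    """
    removed = []
    for sidecar in sorted(Path(directory).glob("*.entity-map.json")):
        sidecar.unlink()
        removed.append(sidecar.name)
    return removed
```

Note that unlink only removes the directory entry. It is not a secure erase, so use a dedicated tool or an encrypted volume if the storage medium itself must not retain the data.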

Requirements

  • Python 3.11+
  • No external dependencies (stdlib only)

About

Built by AgentWard — the open-source permission control plane for AI agents.
