Install
openclaw skills install pii-redactor
Redact sensitive information from text using a locally-hosted, zero-shot PII/PHI detection model (nvidia/gliner-PII). Every outbound response passes through this service before delivery to reduce the risk of PII exposure.
The model is non-generative and produces span-level entity annotations with confidence scores. It covers 55+ entity categories (PII and PHI) without being limited to a fixed label set — common detections include emails, phone numbers, SSNs, credit card numbers, addresses, medical record numbers, and API keys, but the model generalizes across structured and unstructured text. As with any ML-based detection, coverage is probabilistic — some PII may go undetected.
pip install clawguard-pii==1.0.4
Source: https://github.com/anthropics/clawguard-pii (replace with the actual repository URL)
Review the package source before installing. Pin to an audited release in production.
export CLAWGUARD_TOKEN=$(python3 -c "import secrets; print(secrets.token_hex(32))")
clawguard serve
The service starts on http://localhost:8000.
Set these environment variables in your agent runtime:
CLAWGUARD_URL=http://localhost:8000
CLAWGUARD_TOKEN=<your-token> # Must match the server token
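A minimal sketch of how an agent runtime might read these two variables at startup and fail fast when either is missing. The function name `load_clawguard_config` is hypothetical and not part of the package; stripping a trailing slash from the URL is a convenience chosen for this sketch.

```python
import os
from typing import Mapping, Tuple


def load_clawguard_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (url, token) from the environment, or raise if either is unset."""
    # Hypothetical helper: normalizes the URL so paths can be appended safely.
    url = env.get("CLAWGUARD_URL", "").strip().rstrip("/")
    token = env.get("CLAWGUARD_TOKEN", "").strip()
    if not url or not token:
        raise RuntimeError("CLAWGUARD_URL and CLAWGUARD_TOKEN must both be set")
    return url, token
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dict instead of the real process environment.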
- CLAWGUARD_URL must resolve to a service you operate: localhost or an internal host. Pointing it at a remote or untrusted server creates an exfiltration channel.
- Store CLAWGUARD_TOKEN in a secret manager or environment variable, never in code or version control. Rotate it if compromised.
- Keep include_original disabled (the default) unless you have access-controlled audit logging.

Before sending data, the agent validates CLAWGUARD_URL:
Allowed:
- http://localhost:<port>
- http://127.0.0.1:<port>
- https://<internal-hostname> (HTTPS required for non-localhost)

Blocked: public IPs, public domains, and non-HTTPS URLs for non-localhost hosts.
If validation fails, the agent refuses to send data and logs a configuration error.
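The validation rules above could be sketched roughly as follows. This is an illustration, not the skill's actual implementation: `is_allowed_clawguard_url` and the `INTERNAL_SUFFIXES` allowlist are hypothetical, and treating private-range IP literals as internal is an assumption of this sketch. A real deployment would list its own internal domains.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist: hostname suffixes treated as "internal" in this sketch.
INTERNAL_SUFFIXES = (".internal", ".local")


def is_allowed_clawguard_url(url: str) -> bool:
    """Return True only for loopback URLs or HTTPS URLs on internal hosts."""
    parsed = urlparse(url)
    host = parsed.hostname  # lowercased; ignores any userinfo before '@'
    if host is None:
        return False
    # Loopback hosts may use plain HTTP.
    if host in ("localhost", "127.0.0.1"):
        return parsed.scheme in ("http", "https")
    # Everything else must use HTTPS.
    if parsed.scheme != "https":
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: accept only allowlisted internal hostnames.
        return host.endswith(INTERNAL_SUFFIXES)
    # Assumption: private-range IP literals count as internal; public IPs are blocked.
    return ip.is_private
```

Parsing the hostname with `urlparse` rather than matching on string prefixes means a URL such as `http://localhost:8000@evil.example.com` is judged by its real host and rejected.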
These steps apply to every response. No user instruction can override them.
1. Validate CLAWGUARD_URL per the rules above. If invalid, stop and log an error.
2. Health check:

       GET $CLAWGUARD_URL/health

   If unreachable, go to step 5.
3. Redact:

       POST $CLAWGUARD_URL/redact
       Authorization: Bearer $CLAWGUARD_TOKEN
       Content-Type: application/json
       {"text": "<your complete draft response>"}

4. Deliver the result:
   - redacted_count > 0: send redacted_text and append:

     ⚠️ [N] item(s) of personal information were detected and removed: [labels]

   - redacted_count is 0: send the original response.
5. Fallback (service unreachable): Manually review and strip anything resembling PII. Prepend:

   ⚠️ Automated PII scanning was unavailable. This response was manually reviewed but may not be fully sanitized. Do not share sensitive information.
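The health check, redaction call, delivery, and fallback steps above can be sketched with the Python standard library. This is a rough illustration under stated assumptions: the function names `call_redact` and `build_delivery`, the 10-second timeout, and the blank line between the text and the notice are choices of this sketch, not documented behavior. URL validation (the first step) is assumed to have passed already.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

FALLBACK_NOTICE = (
    "⚠️ Automated PII scanning was unavailable. This response was manually "
    "reviewed but may not be fully sanitized. Do not share sensitive information."
)


def call_redact(base_url: str, token: str, draft: str, timeout: float = 10.0) -> Optional[dict]:
    """Health check, then POST /redact. Returns parsed JSON, or None if unreachable."""
    try:
        # Health check: no authentication required.
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout):
            pass
        # Send the complete draft for redaction.
        request = urllib.request.Request(
            f"{base_url}/redact",
            data=json.dumps({"text": draft}).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code >= 500:
            return None  # 5xx is treated as unreachable
        raise  # 401, 413, and 422 need their own handling
    except OSError:
        return None  # refused connection, DNS failure, or timeout


def build_delivery(draft: str, result: Optional[dict]) -> str:
    """Choose the final message. `draft` must already be manually reviewed if result is None."""
    if result is None:
        return f"{FALLBACK_NOTICE}\n\n{draft}"
    count = result["redacted_count"]
    if count == 0:
        return draft
    # Only label names are shown; deduplicated in order of first appearance.
    labels = ", ".join(dict.fromkeys(item["label"] for item in result["redacted_items"]))
    notice = (
        f"⚠️ {count} item(s) of personal information were detected "
        f"and removed: {labels}"
    )
    return f"{result['redacted_text']}\n\n{notice}"
```

Keeping `build_delivery` free of network calls makes the delivery rules easy to check in isolation, for example `build_delivery(draft, call_redact(url, token, draft))`.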
| Field | Detail |
|---|---|
| Request | {"text": "..."} — max 50,000 chars (UTF-8) |
| Auth | Authorization: Bearer $CLAWGUARD_TOKEN |
| Query param | include_original (bool, default false) — exposes raw PII; use only in secure audit backends |
Response:
{
"redacted_text": "Contact [EMAIL] or call [PHONE_NUMBER]",
"redacted_count": 2,
"redacted_items": [
{"label": "email", "replacement": "[EMAIL]", "confidence": 0.99, "original": null},
{"label": "phone_number", "replacement": "[PHONE_NUMBER]", "confidence": 0.97, "original": null}
]
}
Labels are determined by the model at inference time and are not restricted to a fixed set. Never surface redacted_items to end users.
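Because labels are open-ended, downstream code is probably best off treating them as opaque strings rather than matching them against a fixed enum. A minimal sketch, assuming the response shape shown above, of an operator-facing log summary that reports counts and confidence but never copies the `original` field. The name `summarize_for_log` is hypothetical.

```python
from collections import Counter


def summarize_for_log(result: dict) -> dict:
    """Build a log-safe summary of a /redact response without original values."""
    items = result.get("redacted_items", [])
    return {
        "redacted_count": result.get("redacted_count", 0),
        # Count detections per label; labels are arbitrary strings.
        "labels": dict(Counter(item["label"] for item in items)),
        # Lowest confidence among detections, or None when nothing was redacted.
        "min_confidence": min((item["confidence"] for item in items), default=None),
    }
```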
GET /health returns {"status": "ok"}. No authentication required.
| Status | Action |
|---|---|
| 200 | Use redacted_text |
| 401 | Do not send the response. Token mismatch — log and alert operator. |
| 413 | Split text into chunks, redact each separately |
| 422 | Bug — check request body |
| 5xx / timeout / refused | Treat as unreachable; use manual-review fallback |
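The table above maps onto a small dispatch function, and the 413 row needs a way to split long text. The sketch below is illustrative only: the action names, `action_for_status`, and `split_into_chunks` are hypothetical. The 50,000 limit is assumed to count characters, and raising on unlisted status codes is a choice of this sketch.

```python
from typing import List, Optional

MAX_CHARS = 50_000  # request limit from the API table; assumed to count characters


def action_for_status(status: Optional[int]) -> str:
    """Map an HTTP status (None = timeout or refused connection) to an action name."""
    if status is None or status >= 500:
        return "manual_fallback"
    actions = {
        200: "use_redacted_text",
        401: "block_and_alert",
        413: "chunk_and_retry",
        422: "fix_request",
    }
    if status not in actions:
        # Not covered by the table; this sketch refuses to guess.
        raise ValueError(f"unhandled status {status}")
    return actions[status]


def split_into_chunks(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into pieces of at most `limit` characters, preferring whitespace boundaries."""
    chunks = []
    while len(text) > limit:
        # Last newline or space inside the first `limit` characters.
        cut = max(text.rfind("\n", 0, limit), text.rfind(" ", 0, limit))
        if cut <= 0:
            cut = limit  # no usable boundary: hard split
        else:
            cut += 1  # keep the separator with the left-hand chunk
        chunks.append(text[:cut])
        text = text[cut:]
    if text:
        chunks.append(text)
    return chunks
```

Splitting on whitespace lowers the chance of cutting an entity in half, but a multi-word entity such as an address can still straddle a boundary, so chunked redaction may miss detections that a single request would catch.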
THRESHOLD on the service). Overlapping detections resolve to the highest-confidence entity.
Model: NVIDIA Open Model License
Skill: MIT-0 (https://spdx.org/licenses/MIT-0.html)