Sensitive Data Masker
v1.0.7
Intelligent sensitive data detection and masking. Uses Microsoft Presidio + SQLite for automatic PII redaction with local restoration support.
Security Scan
OpenClaw
Suspicious
high confidence
Purpose & Capability
The name/description match the included code: it uses Presidio for detection and SQLite + encryption for local mapping and restoration. However, the runtime requirements declared in metadata/SKILL.md do not include the cryptography package even though the Python code requires it and fails if it's not present. That omission is incoherent with the code's stated 'REQUIRED - no fallback' encryption behavior.
Instruction Scope
The handler launches the Python masker by putting the entire message content on the child process command line (spawn('python3', [MASKER_SCRIPT, 'mask', content])). Passing raw messages (potentially secrets) as argv exposes them to other local users via process listings (ps), which contradicts the skill's goal of protecting secrets. Aside from that, the instructions and code operate only on local storage and do not call external endpoints.
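The argv exposure described above can be avoided by writing the message to the child's standard input. The following is a minimal sketch of that pattern in Python using `subprocess.run`. The real handler is JavaScript, where the equivalent would be writing to the child's stdin stream, and the masker would need a matching stdin-reading mode. `CHILD_SCRIPT` and `run_via_stdin` are hypothetical names, and the inline child simply uppercases its input as a stand-in for masking.

```python
import subprocess
import sys

# Hypothetical stand-in for the masker: it reads the message from stdin
# instead of argv, so the text never shows up in process listings.
CHILD_SCRIPT = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def run_via_stdin(content: str) -> str:
    """Send `content` to the child over a pipe and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],  # argv carries no message text
        input=content,          # delivered through the stdin pipe
        capture_output=True,
        text=True,
        check=True,             # raise if the child exits non-zero
    )
    return result.stdout
```

With this shape, `ps` shows only the interpreter and script arguments, while the message itself travels through a pipe that other local users cannot read.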
Install Mechanism
There is no automatic install spec; SKILL.md lists pip/spacy install commands for the user to run. That is low-risk. However, the code requires the cryptography module (and enforces encryption) but the declared install recommendations and metadata do not include it — an inconsistency that will cause the skill to fail or force manual installation.
Credentials
The skill requests no environment variables or external credentials (appropriate). It writes files under ~/.openclaw/data/sensitive-masker and generates an encryption key file; these are proportionate to local mapping/restoration. Note: storing both an encrypted DB and the encryption key locally means a compromise of the user account or backups will expose cleartext; the README warns about backups, but this is an expected tradeoff and should be considered by operators.
Persistence & Privilege
always:false and the skill registers a message:received hook (expected for this purpose). It writes its own files under the user's OpenClaw data directory and does not modify other skills or system-wide settings. No excessive platform privileges are requested.
What to consider before installing
This skill mostly does what it says (local PII detection + local mapping/restore), but two issues need your attention before installing:
1) The Python code requires the cryptography library (it will exit if missing), yet the SKILL.md and metadata do not include installing cryptography. Make sure to pip install cryptography so the skill's encryption works as intended.
2) The hook implementation passes full message text as a command-line argument to the masker process. Command-line arguments are visible to other users on the same machine (ps aux), which can leak secrets. Prefer changing the handler to pass sensitive content via stdin or another IPC mechanism, or ensure the host is multi-user-safe and that only trusted accounts exist.
Also review file permissions and backup policies for ~/.openclaw/data/sensitive-masker (the mapping DB and the encryption key are stored locally and must be protected). If you cannot guarantee host-level protections or cannot enforce the code change to avoid argv exposure, treat this skill as risky and do not enable it on multi-tenant systems.
Like a lobster shell, security has layers: review code before you run it.
Runtime requirements
🔐 Clawdis
Bins: python3
Tags: audited, encrypted, latest, masking, microsoft, pii, presidio, privacy, redaction, required-encryption, secure, security
Sensitive Data Masker
Intelligent sensitive data detection and masking using Microsoft Presidio with SQLite + LRU cache storage.
Features
- ✅ Intelligent detection - Microsoft Presidio (NLP + rules)
- ✅ Fast storage - SQLite + LRU cache
- ✅ Local restoration - 7-day temporary mapping table
- ✅ Auto cleanup - Expired entries removed automatically
- ✅ 100% local - No external API required
- ✅ OpenClaw Hook - Automatic masking on message received
How It Works
User Message
↓
Channel Plugin (Feishu/Telegram/etc)
↓
OpenClaw Gateway (message:received)
↓
Sensitive Data Masker Hook ← Intercept here
↓
Presidio Detection (NLP + Rules)
↓
SQLite + Cache Store Mapping
↓
Masked Message
↓
Send to LLM API (Safe)
↓
Restore Before Task Execution
↓
Execute with Original Data
Detection Types
| Type | Examples | Masked As |
|---|---|---|
| PASSWORD | password=MySecret123 | [PASSWORD:xxx] |
| API_KEY | sk-abcdefghijklmnop | [API_KEY:xxx] |
| TOKEN | token=xyz123 | [TOKEN:xxx] |
| SECRET | secret=abc+/== | [SECRET:xxx] |
| PRIVATE_KEY | BEGIN RSA PRIVATE KEY | [PRIVATE_KEY:xxx] |
| DB_CONNECTION | mongodb://user:pass@host | [DB_CONNECTION:xxx] |
| EMAIL_ADDRESS | user@example.com | [EMAIL_ADDRESS:xxx] |
| PHONE_NUMBER | 13800138000 | [PHONE_NUMBER:xxx] |
| CREDIT_CARD | 4111111111111111 | [CREDIT_CARD:xxx] |
| PERSON | John Doe | [PERSON:xxx] |
| LOCATION | 123 Main St | [LOCATION:xxx] |
| URL | https://example.com | [URL:xxx] |
Installation
# Install dependencies
pip install presidio-analyzer presidio-anonymizer
python3 -m spacy download zh_core_web_sm
# Enable Hook
openclaw hooks enable sensitive-data-masker
# Verify
openclaw hooks check
Usage Examples
User sends:
My password is MySecret123, email is user@example.com
Masked (to API):
My password is [PASSWORD:f2ae1ea6], email is [EMAIL_ADDRESS:96770696]
Mapping stored (7 days):
{
"f2ae1ea6": "MySecret123",
"96770696": "user@example.com"
}
Local restoration (for task execution):
My password is MySecret123, email is user@example.com
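The round trip above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the skill's implementation: the real detector is Presidio, whereas the two regular expressions here are simplified stand-ins, and deriving the 8-character id from a SHA-256 digest is a guess at one possible scheme. `mask`, `restore`, and `PATTERNS` are hypothetical names.

```python
import hashlib
import re

# Illustrative patterns only; the skill itself relies on Presidio.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PASSWORD": re.compile(r"(?<=password is )[^\s,]+"),
}
PLACEHOLDER = re.compile(r"\[[A-Z_]+:([0-9a-f]{8})\]")


def mask(text: str, mapping: dict) -> str:
    """Replace each match with [TYPE:id] and record id -> original."""
    for data_type, pattern in PATTERNS.items():
        def _sub(match, data_type=data_type):
            original = match.group(0)
            # Assumption: the id is the first 8 hex chars of a SHA-256 digest.
            mask_id = hashlib.sha256(original.encode()).hexdigest()[:8]
            mapping[mask_id] = original
            return f"[{data_type}:{mask_id}]"
        text = pattern.sub(_sub, text)
    return text


def restore(text: str, mapping: dict) -> str:
    """Swap placeholders back; unknown ids are left untouched."""
    return PLACEHOLDER.sub(lambda m: mapping.get(m.group(1), m.group(0)), text)
```

Calling `mask(message, mapping)` fills `mapping` as a side effect, and `restore(masked, mapping)` returns the original message as long as the ids are still present.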
Configuration
File: ~/.openclaw/data/sensitive-masker/config.json
{
"enabled": true,
"ttl_days": 7,
"cache_size": 1000,
"auto_cleanup": true,
"cleanup_interval_hours": 1,
"log_enabled": true,
"encrypt_storage": false,
"presidio": {
"language": "zh",
"entities": ["PHONE_NUMBER", "EMAIL_ADDRESS", ...],
"custom_patterns": true
}
}
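One way to consume a file like this is to overlay it on built-in defaults so that a missing file or a missing key still yields a usable configuration. The sketch below assumes that behavior; the source does not say how the skill actually loads its config. `DEFAULTS` covers only a subset of the keys shown above and `load_config` is a hypothetical helper.

```python
import json
from pathlib import Path

# Subset of the keys shown above; the overlay logic is an assumption.
DEFAULTS = {
    "enabled": True,
    "ttl_days": 7,
    "cache_size": 1000,
    "auto_cleanup": True,
    "cleanup_interval_hours": 1,
    "log_enabled": True,
}


def load_config(path: Path) -> dict:
    """Return DEFAULTS overlaid with whatever the file provides."""
    config = dict(DEFAULTS)
    if path.exists():
        # Top-level keys in the file win over the defaults.
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

Note that `dict.update` is a shallow merge, so a nested object such as `presidio` would replace any default as a whole rather than being merged key by key.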
Management Commands
# Test masking
python3 sensitive-masker.py test "my password=123"
# View statistics
python3 sensitive-masker.py stats
# Cleanup expired
python3 sensitive-masker.py cleanup
# Clear all mappings
python3 sensitive-masker.py clear
Performance
| Operation | Latency |
|---|---|
| Hot query (cache) | < 0.1ms |
| Cold query (SQLite) | ~0.5ms |
| Write | < 2ms |
| Max records | 100,000+ |
Cache hit rate: > 90% typical
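The hot-path figure depends on an in-memory LRU cache sitting in front of SQLite. A minimal sketch of such a cache follows, built on `collections.OrderedDict`; the class name and its methods are hypothetical, and the skill's own cache may be structured differently.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        """Insert or refresh an entry, evicting the oldest if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

On a miss, the caller would fall back to the SQLite lookup and then `put` the result so that repeated restorations of the same id stay in memory.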
Security Features
- ✅ File permissions: 600 (owner read/write only)
- ✅ SQLite transaction safety
- ✅ Auto-expiry cleanup
- ✅ LRU cache eviction
- ✅ Local storage only
- ✅ Optional encryption at rest
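For the 600 permission item, the mode is best set at creation time rather than with a later `chmod`, so the file is never briefly readable by others. The sketch below shows that approach on POSIX systems; `write_private` is a hypothetical helper, not a function from the skill.

```python
import os


def write_private(path: str, data: bytes) -> None:
    """Create `path` with mode 0o600 and write `data` to it."""
    # O_EXCL refuses to reuse an existing file that may have looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

Because of `O_EXCL`, calling this on an existing path raises `FileExistsError`, which forces the caller to decide explicitly how to handle a pre-existing key or database file.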
Architecture
Components
- PresidioDetector - Microsoft Presidio integration
- SensitiveMappingStore - SQLite + LRU cache
- ChannelSensitiveMasker - Main masking logic
- OpenClaw Hook - Gateway integration
Database Schema
CREATE TABLE mappings (
mask_id TEXT PRIMARY KEY,
original TEXT NOT NULL,
data_type TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP NOT NULL,
usage_count INTEGER DEFAULT 0
);
CREATE INDEX idx_expires_at ON mappings(expires_at);
CREATE INDEX idx_data_type ON mappings(data_type);
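To show how this schema supports the TTL and auto-cleanup features, here is a small sketch using Python's `sqlite3` module. It is an assumption about usage, not the skill's code: `store`, `cleanup`, and `_stamp` are hypothetical helpers, and the sketch stores `original` as plain text for brevity, whereas the security scan above indicates the skill encrypts its storage.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS mappings (
    mask_id TEXT PRIMARY KEY,
    original TEXT NOT NULL,
    data_type TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL,
    usage_count INTEGER DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_expires_at ON mappings(expires_at);
"""


def _stamp(moment: datetime) -> str:
    # Same UTC text format as CURRENT_TIMESTAMP, so string comparison sorts by time.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def store(conn, mask_id, original, data_type, ttl_days=7, now=None):
    """Insert a mapping that expires `ttl_days` after `now`."""
    now = now or datetime.now(timezone.utc)
    conn.execute(
        "INSERT OR REPLACE INTO mappings (mask_id, original, data_type, expires_at) "
        "VALUES (?, ?, ?, ?)",
        (mask_id, original, data_type, _stamp(now + timedelta(days=ttl_days))),
    )


def cleanup(conn, now=None) -> int:
    """Delete expired rows and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cur = conn.execute("DELETE FROM mappings WHERE expires_at <= ?", (_stamp(now),))
    return cur.rowcount
```

The `idx_expires_at` index is what keeps the `DELETE` in `cleanup` cheap as the table grows, since expired rows can be found without a full scan.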
Files
sensitive-data-masker/
├── SKILL.md # This file (English)
├── SKILL.md # Chinese version
├── sensitive-masker.py # Core script
├── handler.js # OpenClaw Hook
├── masker-wrapper.py # Python wrapper
├── DESIGN.md # Design document
├── README.md # User guide
├── RESEARCH-EXISTING-SOLUTIONS.md # Market research
└── _meta.json # Metadata
Version History
v1.0.0 (2026-03-03)
- Initial release
- Microsoft Presidio integration
- SQLite + LRU cache storage
- OpenClaw Hook support
- 7-day TTL mapping table
- Auto cleanup
Repository
Source: https://gitee.com/subline/onepeace/tree/develop/src/skills/sensitive-data-masker
License: MIT
Author: TK
Issues: https://gitee.com/subline/onepeace/issues
Credits
- Microsoft Presidio - https://github.com/microsoft/presidio
- spaCy - https://spacy.io/
- OpenClaw - https://github.com/openclaw/openclaw
Related Skills
- ssh-batch-manager - Batch SSH key management
- healthcheck - Security hardening and audits
- skill-creator - Create new skills
