Drift Guard

v1.0.3

Detect personality drift, sycophancy creep, and capability degradation in AI agents before they become problems. Tracks behavior metrics over time against he...

by Shadow Rose (@theshadowrose)
License: MIT-0. Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description match the included code and instructions: the scripts compute text-based metrics, capture baselines, record history, and produce reports. Required capabilities (none) are proportional to the stated function.
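To make "text-based metrics" compared against a baseline concrete, here is a minimal sketch. It is not the skill's actual code: the phrase list, the metric names, and the functions `text_metrics` and `metric_deltas` are all hypothetical, chosen only to illustrate the general idea of measuring a response and diffing it against a stored baseline.

```python
import re

# Hypothetical phrase list for illustration; the skill's real metrics are not shown here.
AGREEMENT_PHRASES = ("you're absolutely right", "great question", "excellent point")


def text_metrics(text: str) -> dict:
    """Compute a few simple text-based metrics for one agent response."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    hits = sum(lowered.count(phrase) for phrase in AGREEMENT_PHRASES)
    return {
        "word_count": len(words),
        # Agreement phrases per 100 words, a crude proxy for sycophancy.
        "agreement_per_100_words": 100.0 * hits / max(len(words), 1),
    }


def metric_deltas(baseline: dict, current: dict) -> dict:
    """Absolute change of each baseline metric (0 means no drift on that metric)."""
    return {key: current.get(key, 0) - value for key, value in baseline.items()}


baseline = text_metrics("The function returns a list of ten items.")
current = text_metrics("Great question! You're absolutely right, the function returns a list.")
deltas = metric_deltas(baseline, current)
```

A real deployment would compare such deltas against thresholds before raising an alert; the thresholds themselves are a tuning decision this sketch leaves out.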
Instruction Scope
Runtime instructions are consistent with the purpose. The tool requires you to save agent responses to files and run analyzers or cron jobs; it does not automatically hook into agent runtimes. Important note: the tool records full metrics and writes history/alert files containing timestamps, metrics, and (indirectly) the analyzed text; this can persist potentially sensitive agent responses on disk.
Install Mechanism
No install spec and no external packages or downloads. Code is stdlib-only Python; nothing in the files pulls remote code or runs installers.
Credentials
No environment variables, secrets, or external credentials are requested. The config contains an optional webhook_url placeholder but the stdlib-only version does not perform HTTP POSTs; enabling webhooks or modifying the code to add network calls would change the threat model and should be audited. The script writes to local files (baseline, history, alerts) which may contain sensitive data.
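One way to re-check the "no network calls" claim after someone modifies the code is to scan the source for imports of network-related modules. The sketch below uses the standard `ast` module; the function name `network_imports` and the module list are assumptions made for illustration, and the check does not catch dynamic imports or subprocess calls to tools such as curl.

```python
import ast

# Modules that usually indicate network access; extend this set for your own audit.
NETWORK_MODULES = {"socket", "http", "urllib", "requests", "ftplib", "smtplib"}


def network_imports(source: str) -> set:
    """Return the top-level names of network-related modules imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in NETWORK_MODULES:
                found.add(top_level)
    return found
```

Running this over each script in the skill and expecting an empty set is a cheap first pass, not a substitute for reading the code.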
Persistence & Privilege
Skill is not always-enabled and is user-invocable. It does write persistent files (baseline.json, drift_history.json, drift_alerts.log, current_alert.json) in the configured paths and will append/write them on each measurement; scheduled use via cron is documented — consider file permissions and retention. No modifications to other skills or system-wide settings are performed.
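As one way to act on the file-permission note above, the following sketch writes a JSON file that only the owning user can read or modify. The helper name `write_private_json` is hypothetical and not part of Drift Guard, and the mode bits are meaningful on POSIX systems only.

```python
import json
import os


def write_private_json(path: str, data) -> None:
    """Write `data` as JSON with owner-only read/write permissions (POSIX)."""
    # The mode passed to os.open applies when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    # Tighten permissions in case the file already existed with a looser mode.
    os.chmod(path, 0o600)
```

Applying the same mode to baseline.json, drift_history.json, drift_alerts.log, and current_alert.json keeps other local users from reading stored responses.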
Assessment
This skill is internally consistent and works locally with the Python stdlib. Before installing or integrating:
(1) Be aware that it stores analyzed responses and metrics on disk (baseline.json, drift_history.json, drift_alerts.log, current_alert.json). Those files can contain sensitive content, so choose storage paths and file permissions carefully.
(2) Test on non-sensitive example responses first.
(3) If you or someone else modifies the code to add webhooks or HTTP clients, audit network behavior and credential handling at that point. The current repo contains a webhook_url placeholder but no implementation.
(4) Scheduled/cron usage is supported. Review retention and rotation of the history files so that sensitive data does not grow without bound.
(5) Drift Guard detects drift but does not remediate it. Pair it with your recovery tooling (CPR) if you want automated restore.
Overall: coherent and reasonable for the stated purpose.
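The retention concern can be handled with a small rotation step run alongside the scheduled measurements. The sketch below assumes that the history file holds a single JSON list ordered oldest-first; that layout is an assumption, since the actual format of drift_history.json is not shown in this listing, and `trim_history` is a hypothetical helper.

```python
import json
import os
import tempfile


def trim_history(path: str, keep_last: int = 500) -> int:
    """Keep only the newest `keep_last` entries; return how many were dropped.

    Assumes the file contains a JSON list ordered oldest-first (an assumption).
    """
    with open(path, encoding="utf-8") as handle:
        entries = json.load(handle)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON list of history entries")
    dropped = max(len(entries) - keep_last, 0)
    if dropped == 0:
        return 0
    # Write to a temporary file in the same directory, then replace atomically
    # so an interrupted run cannot leave a half-written history file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(entries[dropped:], handle)
    os.replace(tmp_path, path)
    return dropped
```

Count-based trimming is the simplest policy; an age-based cutoff would need a timestamp field whose name is likewise not confirmed by this listing.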


Tags: behavior, drift, drift-detection, latest, monitoring, personality, quality-control, sycophancy
