{"skill":{"slug":"drift-guard-sr","displayName":"Drift Guard","summary":"Detect personality drift, sycophancy creep, and capability degradation in AI agents before they become problems. Tracks behavior metrics over time against he...","description":"---\nslug: \"drift-guard-sr\"\nname: \"Drift Guard: Agent Behavior Monitor\"\ndescription: \"Detect personality drift, sycophancy creep, and capability degradation in AI agents before they become problems. Tracks behavior metrics over time against healthy baselines.\"\nauthor: \"@TheShadowRose\"\nversion: \"1.0.3\"\ntags: [\"drift-detection\", \"personality\", \"sycophancy\", \"monitoring\", \"behavior\", \"quality-control\"]\nlicense: \"MIT\"\n---\n\n# Drift Guard: Agent Behavior Monitor\n\n**Detect personality drift, sycophancy creep, and capability degradation in AI agents before they become problems.**\n\nDrift Guard tracks agent behavior metrics over time, compares them against healthy baselines, and alerts you when your agent starts drifting from its intended personality or capability level.\n\n---\n\n## The Problem\n\nAI agents evolve during use. Sometimes that evolution is productive learning. Sometimes it's drift into undesirable behaviors:\n\n- **Personality drift:** Agent becomes more verbose, changes tone, loses its edge\n- **Sycophancy creep:** Excessive agreement, validation-seeking, compliment inflation\n- **Capability degradation:** Hedging language increases, technical depth decreases, confidence drops\n- **Memory pollution:** Corrupted context files influence all future responses\n\nYou don't notice it happening until your sharp, capable agent has turned into a people-pleasing chatbot.\n\n## What Drift Guard Does\n\n### 1. 
**Baseline Capture** (`drift_baseline.py`)\n- Record \"healthy\" agent behavior from known-good responses\n- Analyze multiple samples to create robust baseline metrics\n- Store baseline for ongoing comparison\n- Compare baselines over time to track evolution\n\n### 2. **Continuous Monitoring** (`drift_guard.py`)\n- Analyze each agent response for behavior metrics\n- Calculate drift score against baseline (0.0 = perfect, 1.0 = complete drift)\n- Track metrics: response length, vocabulary diversity, sycophancy markers, hedging language, technical depth\n- Record all measurements with timestamps\n- Trigger alerts when drift exceeds configured thresholds\n\n### 3. **Trend Analysis** (`drift_report.py`)\n- Generate drift trend reports over time\n- Detect anomalies (outlier measurements)\n- Identify which specific metrics are changing\n- Track whether drift is worsening or improving\n- Time-range filtering (last 24h, last week, all time)\n\n---\n\n## Quick Start\n\n### 1. Configure\n\n```bash\ncp config_example.py config.py\n# Edit config.py with your thresholds, patterns, and alert settings\n```\n\n### 2. Capture Baseline\n\nCollect 10-20 agent responses that represent your agent's \"healthy\" behavior. Save each to a text file.\n\n```bash\npython drift_baseline.py capture --files response1.txt response2.txt response3.txt \\\n  --output baseline.json\n```\n\n### 3. Monitor\n\nEach time your agent responds, analyze it:\n\n```bash\npython drift_guard.py agent_response.txt\n```\n\nOr pipe from stdin:\n\n```bash\necho \"Agent response here...\" | python drift_guard.py --stdin\n```\n\n### 4. 
Review Trends\n\n```bash\n# Last 24 hours\npython drift_report.py --hours 24\n\n# All time\npython drift_report.py\n\n# JSON output for scripting\npython drift_report.py --format json\n```\n\n---\n\n## Integration Examples\n\n### Integration with Agent Workflow\n\n```python\nfrom drift_guard import DriftGuard\n\n# Load config\nfrom config import CONFIG\ndg = DriftGuard(CONFIG)\n\n# After agent responds\nagent_response = \"...\"\nresult = dg.monitor(agent_response)\n\nif result['alert_level'] == 'critical':\n    print(f\"ALERT: Agent drift detected ({result['drift_score']:.3f})\")\n    # Trigger recovery: load checkpoint, reset memory, etc.\n```\n\n### Automatic Drift Checks via Cron\n\n```bash\n# Check drift every hour\n0 * * * * cd /path/to/agent && python drift_guard.py latest_response.txt\n\n# Weekly drift report\n0 9 * * 1 cd /path/to/agent && python drift_report.py --hours 168 > weekly_drift.txt\n```\n\n### Pairing with CPR (Context Preservation & Restore)\n\nDrift Guard detects the problem. 
CPR fixes it.\n\n```bash\n# Monitor drift\npython drift_guard.py agent_response.txt\n# Drift score: 0.72 (CRITICAL)\n\n# Restore from checkpoint\npython cpr.py restore --checkpoint 2024-01-15-healthy\n\n# Verify recovery\npython drift_guard.py agent_response.txt\n# Drift score: 0.12 (normal)\n```\n\n---\n\n## How It Works\n\n### Metrics Tracked\n\n| Metric | What It Measures | Why It Matters |\n|--------|------------------|----------------|\n| `char_count` | Response length in characters | Verbosity drift |\n| `word_count` | Response length in words | Verbosity drift |\n| `sentence_count` | Number of sentences | Structure changes |\n| `avg_sentence_length` | Words per sentence | Complexity drift |\n| `vocabulary_diversity` | Unique words / total words | Language degradation |\n| `sycophancy_score` | Frequency of agreement/validation language | People-pleasing behavior |\n| `hedging_score` | Frequency of uncertainty language | Confidence degradation |\n| `validation_score` | Frequency of compliments/encouragement | Sycophancy creep |\n| `exclamation_count` | Number of exclamation marks | Enthusiasm drift |\n| `technical_score` | Frequency of technical terminology | Capability tracking |\n\n### Drift Score Calculation\n\nFor each metric:\n1. Calculate percentage difference from baseline\n2. Apply configured weight (important metrics count more)\n3. Average weighted differences across all metrics\n4. Result: drift score from 0.0 (perfect baseline match) to 1.0 (completely different)\n\n### Alert Levels\n\n- **Warning (0.3):** Minor drift detected. Monitor closely.\n- **Critical (0.6):** Significant drift. Intervention recommended.\n- **Emergency (0.9):** Severe drift. 
Immediate action required.\n\n---\n\n## Use Cases\n\n- **Personality preservation:** Ensure your agent maintains its configured tone and style\n- **Quality monitoring:** Detect when response quality degrades over time\n- **Context corruption detection:** Identify when bad memory files are influencing behavior\n- **Fine-tuning validation:** Verify fine-tuned models maintain desired characteristics\n- **Multi-agent consistency:** Monitor multiple agents to ensure behavioral consistency\n- **Recovery triggers:** Automatically restore from checkpoint when drift exceeds threshold\n\n---\n\n## What's Included\n\n| File | Purpose |\n|------|---------|\n| `drift_guard.py` | Main monitoring engine |\n| `drift_baseline.py` | Baseline capture and comparison |\n| `drift_report.py` | Trend analysis and reporting |\n| `config_example.py` | Configuration template |\n| `LIMITATIONS.md` | What Drift Guard doesn't do |\n| `LICENSE` | MIT License |\n\n---\n\n## Requirements\n\n- Python 3.8+\n- No external dependencies (stdlib only)\n- Works with any AI agent that generates text responses\n\n---\n\n## License\n\nMIT — See `LICENSE` file.\n\n**Author:** Shadow Rose\n\n\n---\n\n## ⚠️ Disclaimer\n\nThis software is provided \"AS IS\", without warranty of any kind, express or implied.\n\n**USE AT YOUR OWN RISK.**\n\n- The author(s) are NOT liable for any damages, losses, or consequences arising from \n  the use or misuse of this software — including but not limited to financial loss, \n  data loss, security breaches, business interruption, or any indirect/consequential damages.\n- This software does NOT constitute financial, legal, trading, or professional advice.\n- Users are solely responsible for evaluating whether this software is suitable for \n  their use case, environment, and risk tolerance.\n- No guarantee is made regarding accuracy, reliability, completeness, or fitness \n  for any particular purpose.\n- The author(s) are not responsible for how 
third parties use, modify, or distribute \n  this software after purchase.\n\nBy downloading, installing, or using this software, you acknowledge that you have read \nthis disclaimer and agree to use the software entirely at your own risk.\n\n\n**DATA DISCLAIMER:** This software processes and stores data locally on your system. \nThe author(s) are not responsible for data loss, corruption, or unauthorized access \nresulting from software bugs, system failures, or user error. Always maintain \nindependent backups of important data. This software does not transmit data externally \nunless explicitly configured by the user.\n\n---\n\n## Support & Links\n\n| | |\n|---|---|\n| 🐛 **Bug Reports** | TheShadowyRose@proton.me |\n| ☕ **Ko-fi** | [ko-fi.com/theshadowrose](https://ko-fi.com/theshadowrose) |\n| 🛒 **Gumroad** | [shadowyrose.gumroad.com](https://shadowyrose.gumroad.com) |\n| 🐦 **Twitter** | [@TheShadowyRose](https://twitter.com/TheShadowyRose) |\n| 🐙 **GitHub** | [github.com/TheShadowRose](https://github.com/TheShadowRose) |\n| 🧠 **PromptBase** | [promptbase.com/profile/shadowrose](https://promptbase.com/profile/shadowrose) |\n\n*Built with [OpenClaw](https://github.com/openclaw/openclaw) — thank you for making this possible.*\n\n---\n\n🛠️ **Need something custom?** Custom OpenClaw agents & skills starting at $500. If you can describe it, I can build it. 
→ [Hire me on Fiverr](https://www.fiverr.com/s/jjmlZ0v)","topics":["Personality","Sycophancy","Agent Behavior","Drift Detection","Monitoring"],"tags":{"drift":"1.0.3","latest":"1.0.3","behavior":"1.0.0","drift-detection":"1.0.0","monitoring":"1.0.0","personality":"1.0.0","quality-control":"1.0.0","sycophancy":"1.0.0"},"stats":{"comments":0,"downloads":618,"installsAllTime":23,"installsCurrent":1,"stars":0,"versions":4},"createdAt":1773079535742,"updatedAt":1778491792475},"latestVersion":{"version":"1.0.3","createdAt":1773082234657,"changelog":"- Updated skill name from \"Drift Guard Agent Behavior Monitor\" to \"Drift Guard: Agent Behavior Monitor\".\n- Bumped version to 1.0.3 in the documentation.\n- No functionality or code changes; documentation now reflects updated name and version.","license":"MIT-0"},"metadata":{"setup":[],"os":null,"systems":null},"owner":{"handle":"theshadowrose","userId":"s1736mx5m1zt9qzh6fvzvffnhh83hgf8","displayName":"Shadow Rose","image":"https://avatars.githubusercontent.com/u/262919821?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1780089816493}}