DriftWatch — Agent Identity Drift Monitor

v1.0.0

Monitor agent identity drift using git history. Detects when AI agents quietly modify their own SOUL.md, IDENTITY.md, AGENTS.md, or memory files — autonomy e...

Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
[!] Purpose & Capability
The declared purpose (auditing the git history of agent identity files) matches the code: it runs git, diffs tracked files, classifies severity, and writes a report. However, SKILL.md and the README mention using an Anthropic API key for LLM semantic analysis, while the registry metadata does not declare any required env vars or binaries. The code also hardcodes many agent memory paths instead of using a pattern (agents/*), which is a minor inconsistency but not necessarily harmful.
[!] Instruction Scope
SKILL.md emphasizes 'read-only' and 'safe to run', which is true for the local git and file reads. But LLM mode (enabled by default unless --no-llm is passed) submits diffs (capped, but still up to several kilobytes each) to an external Claude CLI. That transmits potentially sensitive workspace content (identity, memory snippets, user notes) to an external service. SKILL.md mentions ANTHROPIC_API_KEY in its usage section but does not prominently warn that enabling LLM mode shares workspace text externally.
Install Mechanism
No install spec (instruction-only with included code), so nothing is downloaded or extracted during install. The code runs local git, and the 'claude' CLI if LLM mode is used. Install-mechanism risk is low, but the runtime binary dependencies are not declared.
[!] Credentials
Metadata lists no required env vars, but the README and SKILL.md say LLM analysis requires ANTHROPIC_API_KEY and the code invokes a 'claude' CLI. The skill implicitly requires the 'git' and 'claude' binaries and an Anthropic credential (or functioning claude CLI config). Those credentials/environment requirements are not declared in the registry metadata, which is a mismatch and means users may unknowingly exfiltrate data when LLM mode is used.
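Because these requirements are undeclared, a small preflight check can make them explicit before a run. The following is a minimal sketch, not part of DriftWatch itself: the function name `preflight` and its parameters are hypothetical, and it assumes the binaries are named `git` and `claude` and that the credential is read from `ANTHROPIC_API_KEY`, as the scan text describes.

```python
import os
import shutil


def preflight(use_llm, env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    `env` and `which` are injectable so the check can be exercised without
    touching the real environment (hypothetical helper, not DriftWatch code).
    """
    env = os.environ if env is None else env
    problems = []
    if which("git") is None:
        problems.append("git binary not found on PATH")
    if use_llm:
        # LLM mode sends diffs to an external service, so check it separately.
        if which("claude") is None:
            problems.append("claude CLI not found on PATH (needed for LLM mode)")
        if not env.get("ANTHROPIC_API_KEY"):
            problems.append("ANTHROPIC_API_KEY is not set")
    return problems
```

Calling `preflight(use_llm=False)` only checks for git, which matches the local, heuristic-only mode. A missing `ANTHROPIC_API_KEY` is reported as a problem here even though a configured claude CLI might work without it; that strictness is a design choice of this sketch.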
Persistence & Privilege
always:false and no modifications to other skills or system-wide settings. The script writes reports to its own skill directory (OUTPUT_DIR) and does not attempt to enable itself or persist credentials. No elevated persistence requested.
What to consider before installing
Before installing or using DriftWatch, consider:

  • Functional fit: The script does what it claims: it reads your repo/git history, computes diffs for identity files, and writes a markdown report. Running the heuristic-only mode (--no-llm) is low-risk and stays entirely local.
  • LLM / data exfiltration risk: Enabling LLM analysis causes the tool to call the 'claude' CLI and submit git diffs (snippets) to Anthropic. That can transmit sensitive agent identity content and memory snippets off your machine. The registry metadata does NOT declare ANTHROPIC_API_KEY or the 'claude' binary requirement, so the privacy/credential implication is not made explicit.
  • Missing declarations: The skill uses 'git' and 'claude' via subprocess.run and implicitly requires the Anthropic credential/CLI configuration, but these runtime dependencies are not listed in the skill metadata. Treat LLM mode as networked and external by default.
  • Recommended safe steps:
      - Inspect the code (you already have it). Confirm the WORKSPACE path and TRACKED_FILES align with where you want to run it.
      - Run first with --no-llm to generate a heuristic-only, local report.
      - If you need LLM analysis, prefer running it in an environment where you control the Anthropic key and are comfortable sending those diffs externally; explicitly set and review the 'claude' CLI configuration first.
      - Consider editing TRACKED_FILES or the WORKSPACE constant to limit what is read, or replace the LLM call with a local model if you want semantic analysis without network exposure.
      - Add this to a sandboxed environment or CI job with least privilege if you will run it automatically (cron/heartbeat).
  • When to be cautious: Do not enable LLM mode if the tracked files contain secrets, personally identifiable information, or memory entries you do not want sent to an external provider. Also verify that the script's WORKSPACE path does not point outside your intended repository.
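The last check, that WORKSPACE does not point outside the intended repository, can be sketched as follows. This is an illustrative helper only: `is_inside` is a hypothetical name, and how the real script defines or resolves WORKSPACE is not shown in this listing.

```python
from pathlib import Path


def is_inside(candidate, root):
    """Return True if `candidate` resolves to `root` or to a path beneath it.

    Both paths are resolved first, so `..` segments and symlinks cannot
    be used to slip outside `root`.
    """
    candidate = Path(candidate).resolve()
    root = Path(root).resolve()
    try:
        candidate.relative_to(root)
        return True
    except ValueError:
        return False
```

One way to use it is to compare the script's WORKSPACE value against the repository root you expect, and refuse to run when `is_inside` returns False.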


latest: vk97ej2j7styk24xaw0erjtn9kn82ahm2
280 downloads
0 stars
1 version
Updated 1mo ago
v1.0.0
License: MIT-0

DriftWatch 🔍

Agent Identity Drift Monitor for OpenClaw workspaces

Uses your workspace's existing git history to track changes to agent identity files. For each change it classifies severity, optionally runs LLM semantic analysis, and outputs a human-readable markdown report.
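As a rough illustration of the git-history step, the sketch below lists commits that touched a set of files within a time window. It is an assumption about how such a lookup could be done, not DriftWatch's actual code: `git_log_cmd` and `commits_touching` are hypothetical names, and the tab-separated output format is chosen here for easy parsing.

```python
import subprocess


def git_log_cmd(paths, days):
    """Build a `git log` argv for commits touching `paths` in the last `days` days."""
    return [
        "git", "log",
        f"--since={days} days ago",
        # Commit hash, author date (strict ISO 8601), subject; tab-separated.
        "--format=%H%x09%aI%x09%s",
        "--", *paths,
    ]


def commits_touching(workspace, paths, days=30):
    """Run the command inside `workspace` and return (hash, date, subject) tuples."""
    out = subprocess.run(
        git_log_cmd(paths, days),
        cwd=workspace, capture_output=True, text=True, check=True,
    ).stdout
    return [tuple(line.split("\t", 2)) for line in out.splitlines() if line]
```

Building the argument list separately from running it keeps the command easy to inspect, and passing a list to `subprocess.run` avoids shell quoting issues with file names.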

Usage

# Full report, last 30 days (heuristic only, fast)
python3 skills/driftwatch/driftwatch.py --no-llm --days 30

# With LLM semantic analysis (requires ANTHROPIC_API_KEY)
python3 skills/driftwatch/driftwatch.py --days 30

# Last 7 days
python3 skills/driftwatch/driftwatch.py --no-llm --days 7

# Cron/heartbeat mode: silent unless concerns found
python3 skills/driftwatch/driftwatch.py --cron --days 7

What it tracks

  • SOUL.md — core personality and values
  • IDENTITY.md — agent name, creature, vibe
  • AGENTS.md — operational rules and protocols
  • USER.md — what agents know about their human
  • TOOLS.md — tool and access notes
  • agents/*/MEMORY-INDEX.md — per-agent active context
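The list above mixes fixed file names with one glob pattern. A minimal sketch of resolving such a list against a workspace is shown here; `TRACKED_PATTERNS` and `resolve_tracked` are hypothetical names for illustration, and the security scan notes that the real script hardcodes agent memory paths rather than using a pattern.

```python
from pathlib import Path

# Names taken from the list above; the last entry is expanded as a glob.
TRACKED_PATTERNS = [
    "SOUL.md", "IDENTITY.md", "AGENTS.md", "USER.md", "TOOLS.md",
    "agents/*/MEMORY-INDEX.md",
]


def resolve_tracked(workspace):
    """Return sorted workspace-relative POSIX paths of tracked files that exist."""
    root = Path(workspace)
    found = set()
    for pattern in TRACKED_PATTERNS:
        for path in root.glob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

Using a pattern means newly added agents are picked up without editing the script, at the cost of reading any directory that happens to match.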

Output

Writes a markdown report to the skill directory. Flags:

  • 🟡 Medium: human should review
  • 🔴 High: potential concern — review before next agent session
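To show how these flags could drive the "silent unless concerns found" behaviour of --cron, here is a small sketch. The severity names "medium" and "high" mirror the flags above; the "low" level, the `(path, severity)` input shape, and the function name `cron_summary` are assumptions made for illustration, since the listing does not describe the script's internal data model.

```python
# "medium" and "high" mirror the report flags; "low" is an assumed third level.
FLAGS = {"low": "", "medium": "🟡", "high": "🔴"}


def cron_summary(findings):
    """Return a summary of flagged findings, or None when nothing needs review.

    `findings` is a list of (path, severity) pairs (hypothetical shape).
    """
    flagged = [(p, s) for p, s in findings if s in ("medium", "high")]
    if not flagged:
        return None  # cron mode: stay silent
    return "\n".join(f"{FLAGS[s]} {s.capitalize()}: {p}" for p, s in flagged)
```

A caller would print the result only when it is not None, so a quiet week produces no output at all.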

Add to weekly heartbeat

## Weekly Drift Check (Mondays)
Run: python3 skills/driftwatch/driftwatch.py --cron --days 7

Read-only with respect to your workspace: it does not modify any tracked files and only writes its report to the skill directory.
