Agent Provenance
v1.1.0
Track authorship, review status, and governance of agent instruction files. Adds provenance headers, commit conventions, TTL on agent-written goals, and peri...
Who authored your agent's instruction files?
Agents modify their own instruction files. Without tracking, human-authored rules become indistinguishable from agent-authored additions. This skill provides lightweight governance to maintain that distinction.
The Problem
An agent writes operational knowledge into the same files that contain human directives. Over time:
- No one knows which rules the human set vs. which the agent inferred
- Agent-written goals persist indefinitely without re-authorization
- Config drift is invisible — changes accumulate without review
- There's no audit trail for who changed what or when
File Authority Levels
Split files by who owns them:
| Level | Examples | Agent can modify? |
|---|---|---|
| Human-authored | Identity, principles, rules | Only with explicit human direction |
| Mixed | Operational procedures, heartbeat config | Human sets policy, agent maintains procedures |
| Agent-authored | Learnings, session state, daily notes | Agent writes freely, human reviews periodically |
The key split: human rules go in one file (e.g., PRINCIPLES.md), agent-derived learnings go in another (e.g., LEARNINGS.md). Never mix them.
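One way to read the table above is as a small permission check. The sketch below is an interpretation rather than part of the skill: the `Authority` enum, the `agent_may_modify` function, and the `procedural` flag (standing in for "agent maintains procedures" on mixed files) are all hypothetical names.

```python
from enum import Enum


class Authority(Enum):
    """The three authority levels from the table (values match the header field)."""
    HUMAN = "human-authored"
    MIXED = "mixed"
    AGENT = "agent-authored"


def agent_may_modify(level, human_directed=False, procedural=False):
    """Return True if the agent may edit a file at this authority level.

    Assumption: on mixed files, a change counts as allowed when it is either
    human-directed or purely procedural (no policy change).
    """
    if level is Authority.AGENT:
        return True  # agent writes freely, human reviews periodically
    if level is Authority.MIXED:
        return human_directed or procedural
    return human_directed  # human-authored: explicit direction only


print(agent_may_modify(Authority.HUMAN))
print(agent_may_modify(Authority.MIXED, procedural=True))
```

Whether a given edit is "procedural" is still a judgment call; the function only records the decision once that call has been made.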
Provenance Headers
Add an HTML comment header to every instruction file:
<!--
provenance: human-authored | agent-authored | mixed
description: what this file is
last-reviewed: YYYY-MM-DD
reviewed-by: [human name] | [agent name]
-->
Rules:
- Only the human updates `last-reviewed` and `reviewed-by` on human-authored files
- Agent updates these fields freely on agent-authored files
- On mixed files: agent can update `last-reviewed` when making procedural changes, but should note it was an agent review
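For tooling that needs to read these headers, a minimal parsing sketch might look like the following. It assumes the header is the first thing in the file and that each field sits on its own `key: value` line; `parse_provenance` and the sample file contents (including the reviewer name "Alex") are hypothetical.

```python
import re

# Assumption: the provenance header is an HTML comment at the very top of the file.
HEADER_RE = re.compile(r"\A\s*<!--(.*?)-->", re.DOTALL)
ALLOWED = {"human-authored", "agent-authored", "mixed"}


def parse_provenance(text):
    """Return the header fields as a dict, or None if the file has no header.

    Raises ValueError if a header exists but its provenance value is missing
    or not one of the three known levels.
    """
    match = HEADER_RE.match(text)
    if match is None:
        return None
    fields = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    if fields.get("provenance") not in ALLOWED:
        raise ValueError(f"unknown provenance: {fields.get('provenance')!r}")
    return fields


example = """<!--
provenance: human-authored
description: core rules
last-reviewed: 2026-04-01
reviewed-by: Alex
-->
# Principles
"""
header = parse_provenance(example)
print(header["provenance"], header["last-reviewed"])
```

A file without a header returns `None` rather than raising, which makes it easy to list the files that still need one during setup.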
Commit Message Convention
Tag commits to create an audit trail:
| Tag | Meaning |
|---|---|
| `[human-directed]` | Human explicitly asked for this change |
| `[agent-autonomous]` | Agent decided independently |
| `[heartbeat]` | Change made during a heartbeat cycle |
| `[cron]` | Change made by a cron/background task |
Use in workspace/config repos only. For software projects that may be open-source or shared, use plain descriptive commit messages — provenance tags are AI fingerprints.
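To make the tags usable as an audit trail, something has to read them back. The sketch below assumes the tag is placed at the very start of the commit subject line, which the convention does not spell out; `commit_tag` and `tally_tags` are hypothetical helpers, and the sample subjects are made up.

```python
import re
from collections import Counter

KNOWN_TAGS = ("human-directed", "agent-autonomous", "heartbeat", "cron")
# Assumption: the tag is the first thing in the subject, e.g. "[cron] Rotate notes".
TAG_RE = re.compile(r"^\[([a-z-]+)\]")


def commit_tag(subject):
    """Return the provenance tag at the start of a commit subject, or None."""
    match = TAG_RE.match(subject.strip())
    if match and match.group(1) in KNOWN_TAGS:
        return match.group(1)
    return None


def tally_tags(subjects):
    """Count commits per tag; untagged or unknown tags fall under 'untagged'."""
    return Counter(commit_tag(s) or "untagged" for s in subjects)


subjects = [
    "[human-directed] Tighten escalation rule",
    "[heartbeat] Update LEARNINGS.md",
    "Fix typo",
]
print(tally_tags(subjects))
```

The subject lines themselves could come from something like `git log --format=%s`; a non-zero "untagged" count in a workspace repo is a sign the convention is not being followed.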
TTL on Agent-Written Goals
Any goal, task, or backlog item the agent writes gets a date stamp:
- [ ] Build the deploy script (added: 2026-04-01)
If an agent-written goal is older than 14 days and the human hasn't interacted with it:
- Do not silently keep following it
- Ask the human whether it's still valid
- Remove or re-authorize based on their response
This prevents stale agent-written goals from driving behavior indefinitely.
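The age check can be sketched as below, assuming goals follow the `- [ ] ... (added: YYYY-MM-DD)` format shown above. The function name `stale_goals` and the second and third sample goals are hypothetical. The sketch only covers the date test; whether the human has interacted with a goal has to be tracked some other way.

```python
import re
from datetime import date, timedelta

# Matches open checklist items that carry an "(added: YYYY-MM-DD)" stamp.
GOAL_RE = re.compile(
    r"^- \[ \] (?P<text>.+?) \(added: (?P<added>\d{4}-\d{2}-\d{2})\)\s*$"
)


def stale_goals(lines, today, ttl_days=14):
    """Return the text of open goals added more than ttl_days before today."""
    cutoff = today - timedelta(days=ttl_days)
    stale = []
    for line in lines:
        match = GOAL_RE.match(line.strip())
        if match and date.fromisoformat(match.group("added")) < cutoff:
            stale.append(match.group("text"))
    return stale


backlog = [
    "- [ ] Build the deploy script (added: 2026-04-01)",
    "- [ ] Rotate logs (added: 2026-04-20)",
    "- [x] Archive old notes (added: 2026-03-01)",
]
print(stale_goals(backlog, today=date(2026, 4, 22)))
```

Completed items (`- [x]`) are skipped on purpose, since only open goals can keep driving behavior.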
Instruction Diff Reports
Periodically (weekly recommended), diff all instruction files against their prior state:
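- Compare current versions of instruction files against 7-day-old versions (use `git diff HEAD~7` or similar; note that `HEAD~7` counts seven commits back, not seven days, so a date-based lookup such as `git rev-list -1 --before="7 days ago" HEAD` gives a closer match)
- For each changed file, note: what changed, who changed it (check commit tags), why
- Post the summary for human review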
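The comparison step can be sketched without git at all, assuming the caller has already loaded two snapshots of the instruction files as dictionaries mapping filename to contents (for example via `git show <rev>:<path>`). `diff_report` is a hypothetical helper and the file contents are made up.

```python
import difflib


def diff_report(old_files, new_files):
    """Map each changed filename to its unified diff between two snapshots.

    Files present in only one snapshot are diffed against empty content,
    so additions and deletions show up as well.
    """
    report = {}
    for name in sorted(set(old_files) | set(new_files)):
        old = old_files.get(name, "").splitlines(keepends=True)
        new = new_files.get(name, "").splitlines(keepends=True)
        diff = "".join(
            difflib.unified_diff(old, new, fromfile=f"a/{name}", tofile=f"b/{name}")
        )
        if diff:  # unchanged files produce an empty diff and are left out
            report[name] = diff
    return report


week_ago = {"PRINCIPLES.md": "Be careful.\n", "LEARNINGS.md": "Cache is slow.\n"}
current = {"PRINCIPLES.md": "Be careful.\n", "LEARNINGS.md": "Cache is slow.\nRetry twice.\n"}
for name, diff in diff_report(week_ago, current).items():
    print(name)
    print(diff)
```

This covers "what changed" only; "who" and "why" still need the commit tags and messages for the same range.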
Implementation options:
- Cron job: Run weekly as an isolated agent task, post to a review channel
- Heartbeat task: Run during a heartbeat cycle if cron isn't available
- On-demand: Run when the human asks for a "diff report"
Include a follow-up mechanism — if the human reads but doesn't respond, nudge after a set interval (e.g., 8 hours).
Stale Review Detection
During periodic checks (heartbeat or cron), scan provenance headers:
- If any file has `last-reviewed` older than 30 days, flag it to the human
- Prioritize human-authored files, since those are the authority source
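A scan along these lines could look like the sketch below. It assumes file contents are already loaded into a dictionary and that headers use the `provenance:` and `last-reviewed:` fields shown earlier; treating a missing `last-reviewed` date as stale is an additional assumption, and `stale_reviews` is a hypothetical name.

```python
import re
from datetime import date, timedelta

LAST_REVIEWED_RE = re.compile(r"^last-reviewed:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)
PROVENANCE_RE = re.compile(r"^provenance:\s*(\S+)\s*$", re.MULTILINE)


def stale_reviews(files, today, max_age_days=30):
    """Return names of files needing review, human-authored files first.

    Assumption: a file with no parseable last-reviewed date is flagged too.
    """
    cutoff = today - timedelta(days=max_age_days)
    flagged = []
    for name, text in files.items():
        reviewed = LAST_REVIEWED_RE.search(text)
        if reviewed is None or date.fromisoformat(reviewed.group(1)) < cutoff:
            provenance = PROVENANCE_RE.search(text)
            level = provenance.group(1) if provenance else "unknown"
            # False sorts before True, so human-authored files come first.
            flagged.append((level != "human-authored", name))
    return [name for _, name in sorted(flagged)]


files = {
    "LEARNINGS.md": "<!--\nprovenance: agent-authored\nlast-reviewed: 2026-01-01\n-->\n",
    "PRINCIPLES.md": "<!--\nprovenance: human-authored\nlast-reviewed: 2026-02-01\n-->\n",
    "MEMORY.md": "<!--\nprovenance: mixed\nlast-reviewed: 2026-04-10\n-->\n",
}
print(stale_reviews(files, today=date(2026, 4, 15)))
```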
Setup
- Add provenance headers to all existing instruction files
- Split any file that mixes human rules and agent learnings into two files
- Add commit convention to your AGENTS.md or equivalent
- Set up a weekly diff report (cron or heartbeat)
- Backfill date stamps on any existing agent-written goals/tasks
Integration
- agent-memory: Provenance headers go on MEMORY.md and other long-lived memory files. Distillation routines should respect authority levels (don't route learnings into human-authored files).
- agent-session-state: Session state files are agent-authored by definition — provenance headers optional but useful for consistency.
