Archive Project

v1.2.5

Organize completed projects into searchable archives with session transcript backup.

by KaigeGao (@kaigegao1110)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for kaigegao1110/archive-project.

Install the skill "Archive Project" (kaigegao1110/archive-project) from ClawHub.
Skill page: https://clawhub.ai/kaigegao1110/archive-project
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install archive-project

ClawHub CLI

npx clawhub@latest install archive-project

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
The SKILL.md declares a required config path (~/.openclaw/agents/main/sessions/) and runtime use of session transcripts, which is consistent with the archive purpose — but the registry metadata lists 'Required config paths: none'. This mismatch between declared metadata and runtime instructions is an incoherence that should be resolved (the skill does need access to session files).
Instruction Scope
Runtime instructions explicitly read session transcripts, copy them into workspace/projects/<project-name>/, run the local sanitization script, git-commit the workspace, and optionally delete original session files only with user approval. Reading and processing session transcripts is expected for this skill. The instructions reference an environment variable (SESSION_TRANSCRIPT_PATH) and recommend grep/verification steps — those references are reasonable but grant the skill access to potentially sensitive local transcripts, so users should confirm they are comfortable with that scope.
Install Mechanism
No install spec / no remote downloads — instruction-only with a local sanitization script. This is low-risk from an install perspective (nothing written to disk by an installer beyond what the user does when cloning).
Credentials
Registry metadata lists no required environment variables, but SKILL.md and README use SESSION_TRANSCRIPT_PATH to locate transcripts. The skill will read files from the user's session directory by default. Asking for broad credentials is not present, which is good, but the undeclared env var/config-path usage is an inconsistency that should be documented in the manifest.
Persistence & Privilege
always:false and normal autonomous invocation; the skill performs workspace file writes and git commits (expected for archiving). It does not request permanent platform-wide privileges or modify other skills' configurations. Deletion of original session files is gated by explicit user approval in the instructions.
Assessment
This skill is coherent with its purpose: it reads your session transcripts, runs a local sanitization script, writes an archive into your workspace, and can (with your permission) remove originals. Before installing:

  (1) Confirm you are comfortable the skill will read files under ~/.openclaw/agents/main/sessions/ (or any path you set via SESSION_TRANSCRIPT_PATH).
  (2) Note the registry metadata doesn't declare that config path or the SESSION_TRANSCRIPT_PATH env var; ask the publisher to correct the manifest so access requirements are explicit.
  (3) Review scripts/sanitize_transcript.py to ensure its redaction rules meet your needs and test it on sample transcripts (the skill provides a --test mode).
  (4) Be aware it will run git commit in your workspace and copy transcripts into workspace/projects/, so verify where that workspace is stored and that commits are acceptable.
  (5) Only approve deletions when you have confirmed sanitized backups exist.

If you need stronger guarantees, run the sanitize script yourself on a copy of transcripts before giving the skill any file-access or invoking it.

Like a lobster shell, security has layers — review code before you run it.

latest vk97errps93eyg8r8c8bw2jbv9s83r7tp
179 downloads
0 stars
19 versions
Updated 1mo ago
v1.2.5
MIT-0

Installation

Option 1: ClawHub CLI (recommended)

openclaw skills install archive-project
# or
clawhub install archive-project

Option 2: From GitHub

# Clone the repo
git clone https://github.com/KaigeGao1110/ArchiveProject.git ~/.openclaw/skills/archive-project

# Or download directly
curl -L https://github.com/KaigeGao1110/ArchiveProject/archive/refs/heads/main.zip -o /tmp/archive-project.zip
unzip /tmp/archive-project.zip -d ~/.openclaw/skills/
mv ~/.openclaw/skills/ArchiveProject-main ~/.openclaw/skills/archive-project

Archive Project Skill

Organize a completed project into a complete, long-term searchable archive.

Data Privacy: Archived data (session transcripts, project files) never leaves the internal workspace unless you explicitly approve a publish step. The sanitize script is applied automatically before any archival.


Trigger Conditions

Archiving runs only when you ask for it explicitly, in one of the two ways below. You always decide when a project is done.

Trigger 1: Natural language

Say "archive this" or "can we archive this".

Trigger 2: Slash command

Type //archive followed by your project name to activate the Archive skill. Example: "//archive cureforge-hr-assessment"

However, in these scenarios, I will prompt but not execute:

  • A delivery action just happened (email sent, demo link generated, all subagents done, code committed)
  • You start a new project or say "next task" / "different topic"

I will NOT prompt when:

  • Project is still in active development
  • Task is ongoing operations
  • Waiting on external feedback (48h+ silence)

Archive Flow

Step 1: Create project archive directory

workspace/projects/<project-name>/
  ARCHIVE.md
  session_transcript.jsonl
  subagent_sessions/
  deliverables/
  decisions.md
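
The layout above could be created up front with a few lines of Python. This is a minimal sketch, not part of the skill: the function name `create_archive_skeleton` and the `workspace` argument are assumptions chosen for illustration, and only the file and directory names come from Step 1.

```python
from pathlib import Path

# Names taken from the Step 1 layout; everything else here is illustrative.
ARCHIVE_FILES = ("ARCHIVE.md", "session_transcript.jsonl", "decisions.md")
ARCHIVE_DIRS = ("subagent_sessions", "deliverables")


def create_archive_skeleton(workspace: Path, project_name: str) -> Path:
    """Create workspace/projects/<project_name>/ with the Step 1 layout."""
    root = workspace / "projects" / project_name
    for sub in ARCHIVE_DIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    for name in ARCHIVE_FILES:
        # touch() creates missing files and leaves existing content untouched,
        # so running the function twice is safe.
        (root / name).touch(exist_ok=True)
    return root
```

Because existing files are not overwritten, the function can be re-run on a partially prepared archive directory.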

Step 2: Collect session transcripts

Main session transcript (subagent sessions follow and must also be collected):

# Directory containing session transcripts (configurable via SESSION_TRANSCRIPT_PATH)
# Default: ~/.openclaw/agents/main/sessions/ (standard for all users)
# Override: set SESSION_TRANSCRIPT_PATH to a custom path (e.g., EFS mount)
SESSION_DIR="${SESSION_TRANSCRIPT_PATH:-$HOME/.openclaw/agents/main/sessions/}"
# Ensure a trailing slash so "${SESSION_DIR}"*.jsonl also works for overrides given without one
SESSION_DIR="${SESSION_DIR%/}/"

# Find main session transcript using explicit session key (from session label or passed argument)
# Use the session key/label to match the exact transcript file
SESSION_KEY="${1:-}"  # Pass session key as argument or extract from context
MAIN_SESSION_PATH=""
if [ -n "$SESSION_KEY" ]; then
  # -F matches the key as a literal string; -- protects keys that start with "-"
  MAIN_SESSION_PATH=$(grep -lF -- "$SESSION_KEY" "${SESSION_DIR}"*.jsonl 2>/dev/null | head -1)
fi
# Fallback: if no key provided or not found, use most recent transcript
if [ -z "$MAIN_SESSION_PATH" ] || [ ! -f "$MAIN_SESSION_PATH" ]; then
  MAIN_SESSION_PATH=$(ls -t "${SESSION_DIR}"*.jsonl 2>/dev/null | head -1)
fi
# Stop if nothing was found, rather than running cp with an empty source
if [ -z "$MAIN_SESSION_PATH" ]; then
  echo "No session transcript found in ${SESSION_DIR}" >&2
  exit 1
fi

# Create project archive directory
mkdir -p "workspace/projects/<project-name>/subagent_sessions/"

# Copy main session transcript
cp "$MAIN_SESSION_PATH" "workspace/projects/<project-name>/session_transcript.jsonl"

Child subagent transcripts:

# Child subagent session IDs are listed in the main session JSONL
# Look for "childSessions" array in the session metadata
# Copy each child session transcript to subagent_sessions/
# Pattern: {SESSION_DIR}/{child-id}.jsonl
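
The comments above describe the child-transcript step without showing it. The sketch below is one way it might look in Python, under assumptions the source does not confirm: each line of the main transcript is taken to be a JSON object, and a "childSessions" array of string IDs may appear at any nesting depth. The function names are hypothetical.

```python
import json
import shutil
from pathlib import Path
from typing import Iterator


def iter_child_ids(node) -> Iterator[str]:
    """Yield string entries of every "childSessions" list found anywhere in node."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "childSessions" and isinstance(value, list):
                yield from (v for v in value if isinstance(v, str))
            else:
                yield from iter_child_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_child_ids(item)


def collect_child_ids(main_transcript: Path) -> list[str]:
    """Return unique child session IDs in first-seen order."""
    seen: dict[str, None] = {}
    with main_transcript.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of aborting the archive
            for child_id in iter_child_ids(record):
                seen.setdefault(child_id, None)
    return list(seen)


def copy_child_transcripts(session_dir: Path, child_ids: list[str], dest: Path) -> list[str]:
    """Copy {session_dir}/{child-id}.jsonl into dest; return IDs that had no file."""
    dest.mkdir(parents=True, exist_ok=True)
    missing = []
    for child_id in child_ids:
        src = session_dir / f"{child_id}.jsonl"
        if src.is_file():
            shutil.copy2(src, dest / src.name)
        else:
            missing.append(child_id)
    return missing
```

Returning the missing IDs, rather than raising, lets the caller record gaps in ARCHIVE.md before deciding whether to continue.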

Step 3: Sanitize transcripts (CRITICAL — must do before archiving)

Before archiving, remove:

  • API keys, tokens, and authentication credentials
  • Personal contact information (emails, phone numbers)
  • Internal infrastructure details (hostnames, IPs)
  • Any sensitive environment variables

Use the sanitization script:

python3 scripts/sanitize_transcript.py \
  workspace/projects/<project-name>/session_transcript.jsonl \
  -o workspace/projects/<project-name>/session_transcript_sanitized.jsonl

The script redacts:

  • API keys (GitHub tokens, OpenAI keys, AWS credentials, etc.)
  • Email addresses
  • Phone numbers
  • IP addresses (IPv4 and IPv6)
  • Internal hostnames and AWS EC2 DNS names
  • Generic secrets and high-entropy tokens

Verify before proceeding:

# Run built-in tests to confirm redaction works
python3 scripts/sanitize_transcript.py --test

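# Manual spot-check (look for any remaining sensitive data)
# Redaction placeholders are stripped first so that e.g. [REDACTED-GITHUB-TOKEN] does not match "token"
sed -E 's/\[REDACTED[^]]*\]//g' \
  workspace/projects/<project-name>/session_transcript_sanitized.jsonl \
  | grep -iE '(token|key|password|email|phone|@|192\.168|10\.)' \
  || echo "No matches for the spot-check patterns"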
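
A keyword-based spot-check can also match the redaction placeholders themselves (for example, "TOKEN" inside a placeholder). As a complementary check, the sketch below strips placeholders first and then reports line numbers for a few residual patterns. The regexes and the name `find_residuals` are assumptions for illustration; they are not the rules used by scripts/sanitize_transcript.py.

```python
import re
from pathlib import Path

# Placeholders such as [REDACTED-EMAIL] or [REDACTED] are expected output, not leaks.
PLACEHOLDER = re.compile(r"\[REDACTED(?:-[A-Z0-9]+)*\]")

# Illustrative patterns only (assumed, not taken from the skill's script).
RESIDUAL_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "private_ipv4": re.compile(
        r"\b(?:10(?:\.\d{1,3}){3}|192\.168(?:\.\d{1,3}){2})\b"
    ),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}\b"),
}


def find_residuals(path: Path) -> list[tuple[int, str, str]]:
    """Return (line_number, category, matched_text) for every suspicious match."""
    hits = []
    with path.open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            cleaned = PLACEHOLDER.sub("", line)  # ignore already-redacted spans
            for category, pattern in RESIDUAL_PATTERNS.items():
                for match in pattern.finditer(cleaned):
                    hits.append((lineno, category, match.group(0)))
    return hits
```

An empty result only means these particular patterns found nothing; it is not proof that the transcript is free of sensitive data.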

After verification, replace the original with the sanitized version:

mv workspace/projects/<project-name>/session_transcript_sanitized.jsonl \
   workspace/projects/<project-name>/session_transcript.jsonl

Step 4: Write ARCHIVE.md

Use the template below. Fill in decision rationale — this is the most valuable part for future retrospectives.

Step 5: Update MEMORY.md

Add a one-line summary to MEMORY.md: project name + status + link.
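
The source specifies only "project name + status + link" for the MEMORY.md line, so the exact layout is open. The sketch below assumes a bullet with pipe separators and a hypothetical helper named `add_memory_entry`; it skips the write when the same line is already present.

```python
from pathlib import Path


def add_memory_entry(memory_file: Path, project: str, status: str, link: str) -> bool:
    """Append a one-line summary; return False if that exact line already exists."""
    entry = f"- {project} | {status} | {link}"  # assumed line format
    existing = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    if entry in existing.splitlines():
        return False
    # Start on a fresh line if the file does not already end with a newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{separator}{entry}\n")
    return True
```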

Step 6: Delete EFS session files (requires approval)

Before deleting any session files from EFS, ask the user:

"Can I delete the EFS session files for this project? They are already backed up in the archive."

Only proceed if the user explicitly approves. Never auto-delete without asking.

If approved:

# Remove the main session transcript from EFS
rm -f "${SESSION_DIR}$(basename "$MAIN_SESSION_PATH")"

# Remove any child subagent session transcripts from EFS
for CHILD_ID in <child-session-ids>; do
  rm -f "${SESSION_DIR}${CHILD_ID}.jsonl"
done

If not approved, leave the EFS session files as-is.
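
The approval gate can be paired with a check that each original really has a non-empty archived copy before it is removed. The following is a sketch under stated assumptions: the caller builds a mapping from each original transcript to its archived copy, and `delete_if_backed_up` is a hypothetical name, not something the skill provides.

```python
from pathlib import Path


def delete_if_backed_up(pairs: dict[Path, Path], approved: bool) -> list[Path]:
    """Delete originals only with approval and only when a non-empty copy exists.

    pairs maps each original transcript path to its archived copy.
    Returns the list of originals that were deleted.
    """
    deleted: list[Path] = []
    if not approved:
        return deleted  # never delete without explicit approval
    for original, archived in pairs.items():
        if archived.is_file() and archived.stat().st_size > 0:
            original.unlink(missing_ok=True)
            deleted.append(original)
    return deleted
```

Originals without a verified copy are left in place, so a partially failed archive cannot lead to data loss through this step.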

Step 7: Git commit (internal workspace only)

cd workspace
git add projects/<project-name>/
git commit -m "Archive: <project-name>"

Keep project data private. Archive data is for internal reference only.


ARCHIVE.md Template

# <Project Name> — Project Archive

_Created: <date> | Owner: <owner> | Status: <status>_

---

## One-Line Summary

<1-2 sentences: what this project does, who it's for, its core value>

---

## Project Background

### Client
<Name + contact info — after archiving, record only what is needed for future reference>

### Source Materials
| File | Content |
|------|---------|
| <file1> | <description> |
| <file2> | <description> |

---

## Deliverables

### Code / Product
| Path | Description |
|------|-------------|
| <path> | <description> |

### Reports / Docs
| File | Description |
|------|-------------|
| <file> | <description> |

### Demo / Links
| Link | Description |
|------|-------------|
| <URL> | <description> |

---

## Timeline

| Date | Event |
|------|-------|
| YYYY-MM-DD | <event> |
| YYYY-MM-DD | <delivery> |

---

## Key Decisions

### N. <Decision Title>
**Options:** A vs B (chose A)
**Rationale:** <why this choice>
**Outcome:** <what happened>

---

## Open Items

| Item | Description | Priority |
|------|-------------|----------|
| <item> | <description> | High/Med/Low |

---

## Lessons Learned

### N. <Lesson Title>
<What was learned, what to do differently next time>

---

## Git Commits (Internal)

| Stage | Commit | Description |
|-------|--------|-------------|
| Initial | <hash> | <description> |
| Delivery | <hash> | <description> |

---

## Reconstruction Guide

```bash
<reconstruction commands>
```

decisions.md Template

# Key Decisions — <project-name>

## Decision N
- Date:
- Problem:
- Options:
  - A: <description>
  - B: <description>
- Decision: <what was chosen>
- Rationale: <why>

Sanitization Script Reference

The scripts/sanitize_transcript.py script provides deterministic, audited redaction of sensitive data from session transcripts.

What it redacts

| Category | Examples | Replacement |
|----------|----------|-------------|
| GitHub tokens | ghp_xxx, github_pat_xxx | [REDACTED-GITHUB-TOKEN] |
| OpenAI keys | sk-xxx, sk-proj-xxx | [REDACTED-OPENAI-KEY] |
| Anthropic keys | sk-ant-xxx | [REDACTED-ANTHROPIC-KEY] |
| AWS credentials | AKIAxxx, aws_access_key_id=xxx | [REDACTED] |
| Email addresses | user@example.com | [REDACTED-EMAIL] |
| Phone numbers | +1 555-123-4567 | [REDACTED-PHONE] |
| IPv4 addresses | 192.168.1.1, 10.0.0.1 | [REDACTED-IP] |
| IPv6 addresses | 2001:db8::1 | [REDACTED-IPV6] |
| Internal hostnames | ip-10-0-1-43.local | [REDACTED-HOSTNAME] |
| AWS EC2 DNS | ec2-xxx.amazonaws.com | [REDACTED-AWS-HOST] |
| Generic secrets | High-entropy base64/hex strings | [REDACTED-SECRET] |
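
To make the table concrete, here is a small sketch of ordered, pattern-based redaction covering a handful of the categories. The replacement labels come from the table; the regular expressions and the `redact` function are assumptions written for illustration and are likely simpler than whatever scripts/sanitize_transcript.py actually uses.

```python
import re

# (pattern, replacement) pairs, applied in order. More specific key formats
# come first so that sk-ant-... is not labeled as an OpenAI key.
RULES = [
    (re.compile(r"\b(?:ghp_|github_pat_)[A-Za-z0-9_]+"), "[REDACTED-GITHUB-TOKEN]"),
    (re.compile(r"\bsk-ant-[A-Za-z0-9_-]+"), "[REDACTED-ANTHROPIC-KEY]"),
    (re.compile(r"\bsk-(?:proj-)?[A-Za-z0-9_-]+"), "[REDACTED-OPENAI-KEY]"),
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[REDACTED-EMAIL]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[REDACTED-IP]"),
]


def redact(text: str) -> str:
    """Apply every rule in order and return the redacted text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(redact("mail user@example.com from 192.168.1.1"))
```

Rule order matters in this design: a generic pattern placed before a specific one would claim the match and produce the wrong label. The simple IPv4 pattern here would also match dotted version strings such as 1.2.3.4, which is one reason to review the real script's rules against your own transcripts.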

Usage

# Basic usage — output to stdout
python3 scripts/sanitize_transcript.py input.jsonl > sanitized.jsonl

# Explicit output file
python3 scripts/sanitize_transcript.py input.jsonl -o sanitized.jsonl

# Read from stdin
cat input.jsonl | python3 scripts/sanitize_transcript.py > sanitized.jsonl

# Run built-in tests
python3 scripts/sanitize_transcript.py --test

Properties

  • Deterministic: Same input always produces identical output
  • Non-destructive: Original file is never modified
  • Structure-preserving: JSON/JSONL structure is maintained; only string values are redacted
  • Testable: Built-in test mode verifies redaction patterns
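
The structure-preserving and non-destructive properties can be illustrated with a recursive walk that rewrites only string values and returns new objects. This is a sketch of the idea, not the script's implementation: `redact_strings` and `sanitize_jsonl` are hypothetical names, and the redaction function is passed in by the caller.

```python
import json
from typing import Callable, Iterable, Iterator


def redact_strings(node, redact: Callable[[str], str]):
    """Return a copy of node with redact() applied to string values only."""
    if isinstance(node, str):
        return redact(node)
    if isinstance(node, list):
        return [redact_strings(item, redact) for item in node]
    if isinstance(node, dict):
        # Keys are kept as-is so the record structure is unchanged.
        return {key: redact_strings(value, redact) for key, value in node.items()}
    return node  # numbers, booleans and null pass through unchanged


def sanitize_jsonl(lines: Iterable[str], redact: Callable[[str], str]) -> Iterator[str]:
    """Yield sanitized JSONL lines; blank lines are skipped, input is not modified."""
    for line in lines:
        if line.strip():
            record = json.loads(line)
            yield json.dumps(redact_strings(record, redact), ensure_ascii=False)
```

Provided the supplied `redact` function is itself deterministic, running `sanitize_jsonl` twice on the same input yields identical output, which matches the determinism property listed above.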
