Install
```shell
openclaw skills install agent-learner
```

Benchmark and compare agent prompts and evaluation results. Use when tuning strategies, evaluating outputs, or comparing configurations.

An AI toolkit for configuring, benchmarking, comparing, and optimizing agent prompts and evaluation results. Agent Learner provides persistent, file-based logging for each command category, with timestamped entries, summary statistics, multi-format export, and full-text search across all records.
| Command | Description |
|---|---|
| `configure` | Configure agent settings — log configuration entries or view recent ones |
| `benchmark` | Benchmark agent performance — log benchmark results or view history |
| `compare` | Compare agent outputs — log comparison data or view recent comparisons |
| `prompt` | Prompt management — log prompt variations or view recent prompts |
| `evaluate` | Evaluate agent outputs — log evaluation results or view history |
| `fine-tune` | Fine-tune parameters — log fine-tuning sessions or view recent ones |
| `analyze` | Analyze agent behavior — log analysis entries or view recent analyses |
| `cost` | Cost tracking — log cost data or view recent cost entries |
| `usage` | Usage monitoring — log usage metrics or view recent usage data |
| `optimize` | Optimize configurations — log optimization runs or view history |
| `test` | Test agent behavior — log test results or view recent tests |
| `report` | Report generation — log report entries or view recent reports |
| `stats` | Show summary statistics across all log categories (entry counts, data size, first entry date) |
| `export <fmt>` | Export all data in json, csv, or txt format to the data directory |
| `search <term>` | Full-text search across all log files (case-insensitive) |
| `recent` | Show the 20 most recent entries from the activity history log |
| `status` | Health check — show version, data directory, total entries, disk usage, and last activity |
| `help` | Show the full help message with all available commands |
| `version` | Print the current version string |
Each data command (`configure`, `benchmark`, `compare`, etc.) works in two modes:

- With an argument — appends the text as a new timestamped entry to that command's log
- Without an argument — shows the most recent entries for that command
All data is stored in plain text files under the data directory:

- `$DATA_DIR/<command>.log` — one file per command (e.g., `configure.log`, `benchmark.log`, `prompt.log`); each entry is `timestamp|value`
- `$DATA_DIR/history.log` — audit trail of every command executed, with timestamps
- `$DATA_DIR/export.<fmt>` — generated by the `export` command in json, csv, or txt format

Default data directory: `~/.local/share/agent-learner/`
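Because each log line is plain `timestamp|value`, entries can be split with shell parameter expansion alone. A minimal sketch (the ISO-8601 timestamp layout shown here is an assumption; only the `timestamp|value` shape is documented):

```shell
# A log line as described above: timestamp|value
# (the exact timestamp format is an assumption)
line='2025-03-01T12:00:00|GPT-4o on MMLU: 88.7% accuracy'

ts="${line%%|*}"   # everything before the first |
val="${line#*|}"   # everything after the first |

echo "logged at: $ts"
echo "entry:     $val"
```

Splitting on the *first* `|` matters: entry text may itself contain the delimiter, and `${line%%|*}` / `${line#*|}` handle that correctly.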
Requires a shell with `set -euo pipefail` support and the standard utilities `grep`, `cat`, `date`, `echo`, `wc`, `du`, `head`, `tail`, and `basename`.

```shell
# Initialize and check status
agent-learner status

# Log a benchmark result
agent-learner benchmark "GPT-4o on MMLU: 88.7% accuracy, 1.2s avg latency"

# Log a prompt variation
agent-learner prompt "System: You are a helpful coding assistant. Always explain your reasoning step by step."

# Compare two configurations
agent-learner compare "GPT-4o vs Claude-3.5: GPT-4o 12% faster, Claude 5% more accurate on code tasks"

# Track costs
agent-learner cost "March batch: 12,450 tokens input, 3,200 tokens output, $0.47 total"

# View all recent benchmarks
agent-learner benchmark

# Search across all logs for a specific term
agent-learner search "accuracy"

# Export all data as JSON
agent-learner export json

# View summary statistics
agent-learner stats

# Show recent activity
agent-learner recent
```
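Since the logs are one-entry-per-line text files, a per-category count like the one `stats` reports can be approximated with the same standard tools the skill depends on (`wc`, `basename`). A self-contained sketch against sample data in a temp directory (the file names and entry text below are made up; real logs live under the data directory):

```shell
# Build sample logs in a throwaway directory
dir=$(mktemp -d)
printf '2025-03-01T12:00:00|entry one\n2025-03-02T09:30:00|entry two\n' > "$dir/benchmark.log"
printf '2025-03-01T12:05:00|prompt v1\n' > "$dir/prompt.log"

# One line per entry, so wc -l gives the entry count per category
for f in "$dir"/*.log; do
  echo "$(basename "$f" .log): $(wc -l < "$f") entries"
done

rm -rf "$dir"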
All commands return output to stdout. Export files are written to the data directory:
```shell
agent-learner export json   # → ~/.local/share/agent-learner/export.json
agent-learner export csv    # → ~/.local/share/agent-learner/export.csv
agent-learner export txt    # → ~/.local/share/agent-learner/export.txt
```
Every command execution is logged to $DATA_DIR/history.log for auditing purposes.
Powered by BytesAgain | bytesagain.com | hello@bytesagain.com