Langfuse Trace Logger

v1.0.0

Log subagent task completions as Langfuse traces for replay, evaluation, and cost analysis. Called during session-wrap Phase 4. Supports backfill, tag-based...

by Nissan Dookeran (@nissan)

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for nissan/langfuse-trace-logger.

Install the skill "Langfuse Trace Logger" (nissan/langfuse-trace-logger) from ClawHub.
Skill page: https://clawhub.ai/nissan/langfuse-trace-logger
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY
Required binaries: python3
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install langfuse-trace-logger

ClawHub CLI

npx clawhub@latest install langfuse-trace-logger

Security Scan

VirusTotal

Suspicious

OpenClaw

Suspicious

medium confidence

Purpose & Capability

The name and description (logging traces to Langfuse) align with the required env vars LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY and with the python3 requirement. However, the SKILL.md expects specific scripts (e.g., /Users/loki/.openclaw/workspace/scripts/langfuse-trace-logger.py) and a chatterbox venv to already exist. The skill bundle includes no code or install steps to create those scripts or the venv, which is a coherence gap.

Instruction Scope

Instructions direct the agent to run local scripts and to parse memory/YYYY-MM-DD.md files for backfill. Reading local 'memory' files can expose sensitive user data; the backfill behavior and file paths are outside the skill's code and may access private information. The README also references runtime env vars (e.g., SESSION_ID examples) and absolute home paths (/Users/loki/...) that may not exist for other users — the agent could be instructed to read or transmit data the user wouldn't expect.

Install Mechanism

This is an instruction-only skill with no install spec and no code files, so it does not download or write code. That lowers installation risk but also means it assumes preexisting scripts and environments; there's no bundled code to inspect or validate.

Credentials

Requesting the two Langfuse keys is proportional to the described function (sending traces). Still, LANGFUSE_SECRET_KEY is sensitive and would allow writing traces to a Langfuse account, so ensure the keys are scoped to the intended account/project. The SKILL.md also references other local state (memory files, SESSION_ID) that is not declared in the requirements but is used by the scripts, which broadens the effective access.

Persistence & Privilege

always is false and the skill does not request any persistent platform privileges. It does not modify other skills' configs nor ask to be force-enabled; autonomous invocation is allowed (platform default) but not an added privilege here.

What to consider before installing

This skill appears to be a wrapper around existing local scripts that send traces to Langfuse. The credential requests match that purpose, but the skill bundle contains no code and assumes scripts and a specific Python venv exist. Before installing or enabling it:

(1) Verify the referenced scripts actually exist at the stated paths, and inspect their contents to see exactly what files they read and where they send data.
(2) Prefer using a self-hosted Langfuse endpoint (localhost:3100) for sensitive logs, or supply keys scoped with minimal write permissions.
(3) Confirm the chatterbox venv Python (3.11) is used; the SKILL.md warns about silent failure on other Python versions.
(4) Be aware the backfill feature parses memory/YYYY-MM-DD.md files (potentially sensitive). If you don't want that data exported, do not run backfill, or audit the parser first.
(5) If you cannot inspect the scripts or do not trust the source (homepage unknown, source unknown), do not provide LANGFUSE_SECRET_KEY; consider creating a dedicated, limited-permission key or testing in an isolated environment.

Additional info (script contents, where traces are posted) would raise confidence and could change this assessment.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

📈 Clawdis

Bins: python3

Env: LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY

Primary env: LANGFUSE_PUBLIC_KEY

Latest: vk972vjj3m6mtsf5s69ctwybccx83s7ps

84 downloads

0 stars

1 version

Updated 1mo ago

MIT-0

Skill: langfuse-trace-logger

Purpose: Log subagent task completions as Langfuse traces for replay, evaluation, and cost analysis.

Scope: Called by Loki at the end of every session wrap (Phase 4) for each significant subagent completion.

Script: /Users/loki/.openclaw/workspace/scripts/langfuse-trace-logger.py

⚠️ CRITICAL: Python Version

Always use ~/.chatterbox-venv/bin/python3 (Python 3.11.15)

The langfuse SDK uses pydantic v1, which is incompatible with Python 3.14. Running the script with system Python (python3) or pyenv Python (3.14.x) fails silently: there is no import error and no exception, and the trace simply never appears in the Langfuse UI. Expect to lose 30+ minutes of debugging if you miss this.

# ✅ Correct
~/.chatterbox-venv/bin/python3 scripts/langfuse-trace-logger.py ...

# ❌ Wrong — silent failure on Python 3.14
python3 scripts/langfuse-trace-logger.py ...
/Users/loki/.pyenv/versions/3.14.3/bin/python3 scripts/langfuse-trace-logger.py ...
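
Because the failure described above is silent, it can help to make it loud. The following is a minimal sketch of a version guard that a script could call before importing langfuse. It is not part of the skill: the function names are hypothetical, and treating every version below 3.14 as acceptable is an assumption (the source only confirms that 3.11.15 works and 3.14 does not).

```python
import sys

# Assumption: 3.14 is the first affected version, per the warning above.
FIRST_BROKEN = (3, 14)


def is_supported_python(version_info=None):
    """Return True when the interpreter is older than the first known-bad version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < FIRST_BROKEN


def require_supported_python(version_info=None):
    """Raise a clear error instead of letting the trace vanish without one."""
    if version_info is None:
        version_info = sys.version_info
    if not is_supported_python(version_info):
        found = ".".join(str(part) for part in version_info[:3])
        raise RuntimeError(
            f"Python {found} is not supported by this logger; "
            "run it with ~/.chatterbox-venv/bin/python3 instead"
        )
```

Calling `require_supported_python()` at the top of a script turns the silent case into an immediate exception with the fix in the message.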

Basic Invocation

~/.chatterbox-venv/bin/python3 /Users/loki/.openclaw/workspace/scripts/langfuse-trace-logger.py \
  --session-id "$SESSION_ID" \
  --parent-id "agent:main" \
  --agent "kit" \
  --task "task-label-kebab-case" \
  --model "anthropic/claude-sonnet-4-6" \
  --status "completed" \
  --input "full task prompt given to agent (first 4000 chars)..." \
  --output "what the agent returned or accomplished..." \
  --duration 278 \
  --tokens 16900 \
  --project "reddi-agent-protocol" \
  --skills "product-tour-capture"

Trace Schema

| Field | Type | Purpose | Notes |
|---|---|---|---|
| `--session-id` | string | Subagent session key | Use actual subagent session key; enables lineage tracing |
| `--parent-id` | string | Parent session reference | Always `"agent:main"` unless nested subagent |
| `--agent` | string | Agent name | Lowercase: kit, archie, sara, finn, quill, etc. |
| `--task` | string | Task label (kebab-case) | Used for replay grouping: `replay-judge.py --tag "task:kit-setup-rebuild"` |
| `--model` | string | Model used | e.g. `anthropic/claude-sonnet-4-6`, `anthropic/claude-haiku-4-5` |
| `--status` | string | Outcome | `completed` / `partial` / `failed` |
| `--input` | string | Full task prompt | First 4000 chars; this is what gets replayed against other models in judge runs |
| `--output` | string | Result summary | Agent's output/result; this is what the judge scores |
| `--duration` | int | Time in seconds | Used for efficiency analysis and agent routing decisions |
| `--tokens` | int | Total tokens used | Used for cost analysis and budget governance |
| `--project` | string | Project slug | Must match `projects/<slug>/STATUS.md`; enables project-level filtering |
| `--skills` | string | Comma-separated skills | e.g. `"product-tour-capture,ffmpeg-studio"`; enables skill effectiveness filtering |
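
One way to keep calls consistent with this schema is to build the argument list from a typed record rather than by hand. The sketch below is illustrative only: `TraceRecord` and `to_cli_args` are hypothetical names, and validating the status and truncating the input on the caller's side are assumptions about where those checks could live (the logger script itself may already do both).

```python
from dataclasses import dataclass, field

ALLOWED_STATUSES = {"completed", "partial", "failed"}
MAX_INPUT_CHARS = 4000  # "First 4000 chars" per the schema table


@dataclass
class TraceRecord:
    session_id: str
    agent: str
    task: str
    model: str
    status: str
    task_input: str
    output: str
    duration: int
    tokens: int
    project: str
    skills: list = field(default_factory=list)
    parent_id: str = "agent:main"  # default unless this is a nested subagent


def to_cli_args(record):
    """Turn a record into the flag/value list shown under Basic Invocation."""
    if record.status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {record.status!r}")
    return [
        "--session-id", record.session_id,
        "--parent-id", record.parent_id,
        "--agent", record.agent.lower(),
        "--task", record.task,
        "--model", record.model,
        "--status", record.status,
        "--input", record.task_input[:MAX_INPUT_CHARS],
        "--output", record.output,
        "--duration", str(record.duration),
        "--tokens", str(record.tokens),
        "--project", record.project,
        "--skills", ",".join(record.skills),
    ]
```

The resulting list can be appended to the interpreter and script path and passed to `subprocess.run` without shell quoting concerns.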

Tag Taxonomy

The logger automatically generates these tags from the fields above:

agent:kit — from --agent
model_family:claude-sonnet — derived from --model
project:reddi-agent-protocol — from --project
skill:product-tour-capture — one tag per skill in --skills
task:kit-setup-rebuild — from --task
status:completed — from --status

These tags power the replay-judge filter syntax.
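
The mapping from fields to tags can be sketched as below. This is a reconstruction from the examples above, not the logger's actual code: the function names are hypothetical, and the `model_family` rule (drop the provider prefix, then drop trailing numeric version segments) is an assumption that happens to reproduce `claude-sonnet` from `anthropic/claude-sonnet-4-6`.

```python
import re


def model_family(model):
    """Assumed rule: strip the 'provider/' prefix and trailing '-<digits>' parts."""
    name = model.split("/", 1)[-1]
    return re.sub(r"(-\d+)+$", "", name)


def build_tags(agent, model, project, skills, task, status):
    """Build the tag list in the order shown in the taxonomy above."""
    tags = [
        f"agent:{agent}",
        f"model_family:{model_family(model)}",
        f"project:{project}",
    ]
    # --skills is a comma-separated string; emit one tag per non-empty entry.
    for skill in skills.split(","):
        skill = skill.strip()
        if skill:
            tags.append(f"skill:{skill}")
    tags.append(f"task:{task}")
    tags.append(f"status:{status}")
    return tags
```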

Backfill Pattern

For retroactive logging when a session wrap was skipped or traces are missing.

Idempotent: Uses deterministic trace IDs based on date+agent+task hash. Safe to re-run — won't create duplicates.

# Preview first (dry run)
~/.chatterbox-venv/bin/python3 scripts/langfuse-backfill-historical.py \
  --from-date 2026-03-24 \
  --to-date 2026-03-24 \
  --dry-run

# Then run for real
~/.chatterbox-venv/bin/python3 scripts/langfuse-backfill-historical.py \
  --from-date 2026-03-24 \
  --to-date 2026-03-24

Data source: Backfill parses memory/YYYY-MM-DD.md files and extracts structured task outcome blocks. This is why the task outcome block format in memory files must be consistent — inconsistent format breaks parsing silently.
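
To preview which memory files a given date range would touch before running the backfill, something like the following sketch could be used. The helper name is hypothetical, and treating `--to-date` as inclusive is an assumption; the block format inside the files is not documented here, so no parsing is attempted.

```python
from datetime import date, timedelta
from pathlib import Path


def memory_files(from_date, to_date, root="memory"):
    """List memory/YYYY-MM-DD.md paths for each day in the range (end inclusive)."""
    start = date.fromisoformat(from_date)
    end = date.fromisoformat(to_date)
    if end < start:
        raise ValueError("to_date must not be earlier than from_date")
    span = (end - start).days
    return [
        Path(root) / f"{(start + timedelta(days=offset)).isoformat()}.md"
        for offset in range(span + 1)
    ]
```

Checking `path.exists()` on each result shows which days have a memory file at all, which is a quick way to spot gaps before a real run.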

Backfill ID format: backfill-YYYY-MM-DD-<agent>-<task-slug> — deterministic, no duplicate risk.
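
The ID format above can be illustrated with a short sketch. The function names and the slug normalization are assumptions, not the backfill script's code; the point is only that the same inputs always give the same ID, which is what makes a re-run safe.

```python
import re
from datetime import date


def slugify(text):
    """Lowercase and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def backfill_trace_id(day, agent, task):
    """Compose backfill-YYYY-MM-DD-<agent>-<task-slug> from its three inputs."""
    return f"backfill-{day.isoformat()}-{slugify(agent)}-{slugify(task)}"
```

For example, `backfill_trace_id(date(2026, 3, 24), "kit", "setup-rebuild")` gives `backfill-2026-03-24-kit-setup-rebuild`.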

Replay and Judge

# Report on all Kit traces (past 30 days)
~/.chatterbox-venv/bin/python3 scripts/replay-judge.py \
  --tag "agent:kit" --report

# Compare all Kit traces against Haiku (cost reduction analysis)
~/.chatterbox-venv/bin/python3 scripts/replay-judge.py \
  --tag "agent:kit" --models "claude-haiku-4-5" --judge "claude-haiku-4-5" --report

# Judge a specific trace
~/.chatterbox-venv/bin/python3 scripts/replay-judge.py \
  --trace-id "backfill-2026-03-24-kit-setup-rebuild" \
  --models "claude-haiku-4-5" --judge "claude-haiku-4-5"

# Filter by project
~/.chatterbox-venv/bin/python3 scripts/replay-judge.py \
  --tag "project:reddi-agent-protocol" --report

# Filter by skill
~/.chatterbox-venv/bin/python3 scripts/replay-judge.py \
  --tag "skill:product-tour-capture" --report
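
When these commands are driven from another script, building the argument list programmatically avoids quoting mistakes. The helper below is a hypothetical sketch that uses only the flags shown in the examples above; requiring exactly one of `tag` or `trace_id` is an assumption drawn from those examples, not a documented rule of replay-judge.py.

```python
def replay_judge_args(tag=None, trace_id=None, models=None, judge=None, report=False):
    """Build an argv list for scripts/replay-judge.py from the flags shown above."""
    # Assumption: each run selects traces either by tag or by a single trace ID.
    if (tag is None) == (trace_id is None):
        raise ValueError("pass exactly one of tag or trace_id")
    args = ["scripts/replay-judge.py"]
    if tag is not None:
        args += ["--tag", tag]
    else:
        args += ["--trace-id", trace_id]
    if models:
        args += ["--models", models]
    if judge:
        args += ["--judge", judge]
    if report:
        args.append("--report")
    return args
```

The list can be prefixed with the venv interpreter path and handed to `subprocess.run`.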

Verify Traces Appeared

After logging, verify in Langfuse UI: http://localhost:3100

Or check programmatically:

~/.chatterbox-venv/bin/python3 -c "
import subprocess
from langfuse import Langfuse

# Read the secret key from 1Password; check=True fails loudly if op errors out
sk = subprocess.run(
    ['op', 'read', 'op://OpenClaw/Langfuse (Local)/credential'],
    capture_output=True, text=True, check=True
).stdout.strip()

lf = Langfuse(public_key='pk-lf-openclaw-local', secret_key=sk, host='http://localhost:3100')
# Print the name and truncated ID of the five most recent traces
traces = lf.client.trace.list(limit=5)
for t in traces.data:
    print(t.name, t.id[:12])
"

Expected output: the last 5 trace names with truncated IDs. If the output is blank, the likely cause is the Python version issue (see the warning above).

Langfuse Connection Details

| Setting | Value |
|---|---|
| UI | http://localhost:3100 |
| Public key | `pk-lf-openclaw-local` |
| Secret key | `op://OpenClaw/Langfuse (Local)/credential` (1Password) |
| Also in 1Password | `op://OpenClaw/Langfuse (Local)/Secret Key` |
| Docker | Always running (daemon service) |

When to Call This Skill

This skill is called during Phase 4 (Traces) of the session-wrap playbook (playbooks/session-wrap/PLAYBOOK.md).

Call once per significant subagent completion. Use data from the task outcome blocks written in Phase 1 (memory file). Don't reconstruct from memory — read what you just wrote.

Minimum threshold for logging: Any subagent run that produced a deliverable (file written, API called, analysis produced). Skip: simple lookups, 1-line tool calls, failed attempts with no output.
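
The threshold above can be expressed as a small predicate. This is only a sketch: `SubagentRun` and its fields are hypothetical stand-ins for whatever the task outcome block records, and the exact boundary between a counted API call and a skipped one-line tool call is a judgment the source leaves to the caller.

```python
from dataclasses import dataclass


@dataclass
class SubagentRun:
    status: str = "completed"
    output: str = ""
    files_written: int = 0
    api_calls: int = 0
    produced_analysis: bool = False


def should_log(run):
    """Log only runs that produced a deliverable; skip empty failures."""
    if run.status == "failed" and not run.output.strip():
        return False  # failed attempt with no output
    return run.files_written > 0 or run.api_calls > 0 or run.produced_analysis
```

A simple lookup with no file, API call, or analysis falls through to `False`, which matches the skip list.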

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| Trace doesn't appear in UI | Wrong Python version | Use `~/.chatterbox-venv/bin/python3` |
| No output, no error | Same: Python 3.14 pydantic v1 incompatibility | Same fix |
| `ImportError: langfuse not found` | Wrong venv | Same fix |
| Duplicate traces on backfill | Shouldn't happen; backfill is idempotent | Check if running logger + backfill both for same trace |
| `op: command not found` | 1Password CLI not in PATH | Run from shell with OP_SERVICE_ACCOUNT_TOKEN set, or source `~/.zshrc` first |
| Langfuse UI empty after logging | Docker daemon down | Run `docker ps`; restart Langfuse container if needed |
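
Several of the rows above can be checked up front. The following is a minimal preflight sketch, assuming the two env vars from the skill's runtime requirements and the `op` and `docker` commands mentioned in the table; the function name and the injectable `env` and `which` parameters are hypothetical conveniences for testing. It only checks that the tools are on PATH, not that the Langfuse container is actually running.

```python
import os
import shutil

REQUIRED_ENV = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
REQUIRED_TOOLS = ("op", "docker")


def preflight(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all clear."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append(f"missing environment variable: {name}")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"not found on PATH: {tool}")
    return problems
```

Printing the returned list before logging gives a visible reason for a failure that would otherwise show up only as an empty Langfuse UI.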
