Vx Agent Workflow

Dev Tools

Token-efficient command execution patterns for AI agents using vx. Use when running builds, tests, linting, GitHub operations, or any command that produces verbose output. Teaches agents to filter output cross-platform using vx-managed tools (vx rg, vx jq) instead of platform-specific syntax (Select-String, grep, findstr). Includes token measurement, savings tracking, and deep recipes for cargo, gh, pytest, and more.

Install

openclaw skills install vx-agent-workflow

VX Agent Workflow — Token-Efficient Command Execution

Core principle: Use vx rg as the universal cross-platform filter. Never use platform-specific tools like Select-String (PowerShell), grep (Unix), or findstr (cmd). One command, every OS.

Why This Matters

AI agents pay per-token for both input and output. A full cargo test run can produce 500+ lines when you only need the 5 lines showing failures. Agents that dump raw output waste 95% of their context window on noise.

The vx advantage: vx auto-installs rg (ripgrep), jq, fd, and other filtering tools on every platform. Use them instead of shell-specific builtins.

Token Savings Per Operation (Measured)

Based on medium-sized Rust/Python/TypeScript projects:

OperationFrequency/30minRaw tokensFilteredSavings
vx cargo test5x~25,000~2,500-90%
vx cargo build5x~15,000~1,500-90%
vx cargo clippy3x~8,000~800-90%
vx uv run pytest4x~8,000~800-90%
vx gh pr view5x~3,000~300-90%
vx gh run view --log3x~30,000~1,500-95%
vx git status10x~3,000~600-80%
vx git diff5x~10,000~2,500-75%
vx npm test5x~20,000~2,000-90%
vx just quick3x~40,000~2,000-95%
Session total~162,000~12,500-92%

The Universal Filter Pattern

vx <command> 2>&1 | vx rg "<pattern>"

This works identically on:

  • Windows PowerShell 5.1+ / 7+ (pwsh)
  • Windows cmd.exe
  • Linux Bash/Zsh
  • macOS Bash/Zsh

Why NOT platform-specific tools

PlatformNative filterProblem
PowerShellSelect-String -Pattern "..." | Select-Object -Last 30Verbose syntax, outputs objects not text, breaks on Unix
Bash/Zshgrep -E "..." | tail -30Not available on Windows without WSL
cmd.exefindstr /R "..."Limited regex, no tail equivalent, no context lines

With vx, one line replaces all of them:

vx <command> 2>&1 | vx rg -m 30 "pattern"

Deep Recipe: vx cargo (Rust Build & Test)

Build — Progressive Disclosure

# Level 1: Just pass/fail (2 tokens)
vx cargo build 2>&1 && echo "OK" || echo "FAIL"

# Level 2: Error count only (5 tokens)
vx cargo build 2>&1 | vx rg -c "^error"

# Level 3: Error messages only (~50-200 tokens)
vx cargo build 2>&1 | vx rg "^error\[" 

# Level 4: Errors + location (~200-500 tokens)
vx cargo build 2>&1 | vx rg "^error|^\s*--> "

# Level 5: Errors with context (~500-1000 tokens)
vx cargo build 2>&1 | vx rg -B 1 -A 3 "^error\["

# NEVER: Full raw output (5000-20000 tokens)
# vx cargo build  ← don't do this unless debugging a novel issue

Test — Optimal Patterns

# Cheapest: Just the summary line
vx cargo test 2>&1 | vx rg "^test result:"
# Output: test result: ok. 47 passed; 0 failed; 0 ignored

# Failed tests only (names + messages)
vx cargo test 2>&1 | vx rg "FAILED|panicked|thread .+ panicked|assertion"

# Specific crate (80% faster, 80% fewer tokens)
vx cargo test -p vx-cli 2>&1 | vx rg "test result:|FAILED"

# Specific test by name
vx cargo test -p vx-cli test_parse_args 2>&1 | vx rg "test result:|FAILED|panicked"

# JSON format for structured parsing (nightly or with --format)
vx cargo test -- -Z unstable-options --format json 2>&1 | vx jq -c 'select(.event == "failed")'

Clippy — Warnings & Errors

# Count issues (1 token answer)
vx cargo clippy 2>&1 | vx rg -c "^(warning|error)\["

# Issue types only
vx cargo clippy 2>&1 | vx rg -o "(warning|error)\[\w+\]" | sort | uniq -c

# Full warnings with file location
vx cargo clippy 2>&1 | vx rg "^(warning|error)\[|^\s*--> "

# Specific lint only
vx cargo clippy 2>&1 | vx rg "unused_variable|dead_code"

Build + Test Combined (vx just quick)

# Optimal: capture only final results from the full pipeline
vx just quick 2>&1 | vx rg "^(error|warning)\[|test result:|FAILED|Finished|could not compile"

# If expecting success (confirmation only)
vx just quick 2>&1 | vx rg -m 5 "Finished|test result: ok|All checks passed"

Tee Pattern — Save Full Output for Failure Recovery

When a command fails, save full output to a file so you can dig deeper without re-running (inspired by rtk's tee feature):

# Run once, save everything, show only errors
vx cargo test 2>&1 | vx tee .vx-last-output.log | vx rg "FAILED|error|panicked"

# If filtered output is insufficient, read the saved log
vx rg -C 5 "FAILED" .vx-last-output.log

# Alternative: redirect to file + filter separately
vx cargo test > .vx-last-output.log 2>&1; vx rg "test result:|FAILED" .vx-last-output.log

Deep Recipe: vx gh (GitHub CLI)

PR Operations — JSON Field Selection

The single most effective token-saving technique for gh is --json with field selection:

# BAD: Full PR view (500-2000 tokens of formatting)
vx gh pr view 123

# GOOD: Selected fields only (~50-100 tokens)
vx gh pr view 123 --json title,state,mergeable,reviewDecision,headRefName

# With inline jq filtering
vx gh pr view 123 --json files --jq '.files[].path'

# PR list — compact
vx gh pr list --json number,title,state,headRefName --jq '.[] | "\(.number) \(.state) \(.title)"'

CI Checks — Status Only

# BAD: Full checks output (1000+ tokens)
vx gh pr checks 123

# GOOD: Just name + conclusion (~100 tokens)
vx gh pr checks 123 --json name,state,conclusion --jq '.[] | "\(.name): \(.conclusion)"'

# Only failed checks
vx gh pr checks 123 --json name,state,conclusion --jq '.[] | select(.conclusion == "FAILURE") | .name'

# Quick pass/fail
vx gh pr checks 123 --json conclusion --jq 'all(.conclusion == "SUCCESS")'

CI Run Logs — Filtered

CI logs are the #1 token-waster (10,000-100,000+ tokens). Always filter:

# NEVER: Full log dump
# vx gh run view 789 --log  ← 10000+ tokens

# GOOD: Only errors from the log
vx gh run view 789 --log 2>&1 | vx rg -m 50 "^error|FAILED|panic|fatal|Error:"

# Failed step only
vx gh run view 789 --log-failed 2>&1 | vx rg -m 50 "error|FAILED|panic"

# Just the conclusion
vx gh run view 789 --json conclusion,status --jq '.conclusion'

# List recent runs with status
vx gh run list --json databaseId,status,conclusion,headBranch -L 5 --jq '.[] | "\(.databaseId) \(.conclusion) \(.headBranch)"'

Issues — Selective Fields

# Compact issue view
vx gh issue view 456 --json title,state,labels,body --jq '{title,state,labels: [.labels[].name]}'

# Issue list filtered
vx gh issue list --json number,title,state -L 10 --jq '.[] | "\(.number) \(.title)"'

API Calls — Always Use --jq

# Direct API with jq projection
vx gh api repos/{owner}/{repo}/actions/runs --jq '.workflow_runs[:3] | .[] | {id,conclusion,head_branch}'

# PR files changed
vx gh api repos/{owner}/{repo}/pulls/123/files --jq '.[].filename'

# Commit status
vx gh api repos/{owner}/{repo}/commits/{sha}/status --jq '.state'

Deep Recipe: Python Testing (pytest)

Basic Patterns

# Cheapest: Just the summary line
vx uv run pytest 2>&1 | vx rg "passed|failed|error" | vx rg "==="
# Output: === 42 passed, 1 failed in 3.21s ===

# Failed test names only
vx uv run pytest 2>&1 | vx rg "^FAILED "

# Short test summary (pytest's own -r flag)
vx uv run pytest --tb=no -q 2>&1
# Output: 42 passed, 1 failed in 3.21s

# Failed with short traceback
vx uv run pytest --tb=short 2>&1 | vx rg -A 5 "^FAILED|^E "

# Only run tests matching a pattern (fastest)
vx uv run pytest -k "test_parse" --tb=short -q 2>&1

pytest Native Flags That Save Tokens

Use pytest's own flags before piping — they reduce output at source:

# --tb=no: No tracebacks (just pass/fail)
vx uv run pytest --tb=no -q 2>&1

# --tb=line: One-line per failure
vx uv run pytest --tb=line 2>&1

# --tb=short: Short traceback
vx uv run pytest --tb=short 2>&1

# -q / --quiet: Minimal output
vx uv run pytest -q 2>&1

# --no-header: Skip the pytest header
vx uv run pytest --no-header -q 2>&1

# -x: Stop at first failure (don't waste time on cascade)
vx uv run pytest -x --tb=short 2>&1

# Combine for maximum efficiency
vx uv run pytest --no-header --tb=line -q 2>&1
# Output: FAILED tests/test_auth.py::test_login - AssertionError
#         1 failed, 41 passed in 2.1s

pytest JSON Output

# With pytest-json-report plugin
vx uv run pytest --json-report --json-report-file=- 2>&1 | vx jq '{summary: .summary, failed: [.tests[] | select(.outcome == "failed") | .nodeid]}'

# With JUnit XML
vx uv run pytest --junitxml=- 2>&1 | vx rg "failures=|errors="

Coverage — Summary Only

# BAD: Full coverage report (1000+ tokens)
# vx uv run pytest --cov

# GOOD: Just the total
vx uv run pytest --cov --cov-report=term-missing 2>&1 | vx rg "^TOTAL|^FAILED|passed"

# Even better: Just the percentage
vx uv run pytest --cov 2>&1 | vx rg "^TOTAL" | vx rg -o "\d+%"

Deep Recipe: vx git (Version Control)

Progressive Disclosure for Diffs

# Level 0: Any changes? (1 token)
vx git diff --quiet && echo "clean" || echo "dirty"

# Level 1: File names only (~10-30 tokens)
vx git diff --name-only

# Level 2: Stats (insertions/deletions per file, ~50-100 tokens)
vx git diff --stat

# Level 3: Specific file diff
vx git diff -- src/specific_file.rs

# Level 4: Full diff (only when you truly need it)
vx git diff

Status — Compact Forms

# Compact status
vx git status --short --branch
# Output: ## main...origin/main
#          M src/lib.rs
#         ?? new_file.rs

# Count of changes only
vx git status --short | vx rg -c "."

Log — Bounded

# One-line format, limited
vx git log --oneline -10

# Diff stat for recent commits
vx git log --oneline --stat -3

# Files changed in a range
vx git diff --name-only origin/main...HEAD

# Commit messages in a range
vx git log --oneline origin/main...HEAD

vx Built-in Compact Mode (--compact / -u)

vx has a built-in RTK-style compact mode that automatically filters subprocess output without manual piping. This is the easiest way to save tokens:

# Compact mode — vx automatically filters build/test/git output
vx --compact cargo test          # Shows only: test result + failures
vx --compact cargo build         # Shows only: errors + final status
vx --compact git status          # Shows only: short status
vx --compact gh pr view 123      # Shows only: essential fields
vx -u cargo test                 # -u is shorthand for --compact

# Filter aggressiveness levels
vx --compact --filter-level light cargo test      # Light filtering
vx --compact --filter-level normal cargo test     # Default
vx --compact --filter-level aggressive cargo test # Maximum compression

# Set globally for entire session
export VX_OUTPUT=compact
vx cargo test                    # Now always compact
vx cargo build                   # Always compact
vx git status                    # Always compact

Compact vs Manual Pipe — When to Use Which

ApproachUse whenExample
vx --compact <cmd>Quick runs, standard filters are good enoughvx --compact cargo test
<cmd> 2>&1 | vx rg "pattern"Need custom filter patternvx cargo test 2>&1 | vx rg "my_test|FAILED"
--json / --jqCommand has native JSON supportvx gh pr view N --json title,state

The --compact Mode Tracks Token Savings Automatically

Every command run with --compact, --output-format toon, or --json is automatically measured. Check cumulative savings with:

vx metrics tokens
vx metrics tokens --json    # For machine consumption

Token Measurement & Statistics

Built-in: vx metrics tokens

vx has built-in token savings tracking (similar to rtk gain). Every time you use --output-format toon, --compact, or --json, vx automatically records baseline vs actual token counts. Query the data with:

# Terminal table — shows savings per command
vx metrics tokens

# Output:
# Token savings summary
# runs:12 records:15 baseline:4500 actual:1200 net_saved:3300 (73.3%)
#
# Command                              Runs   Before    After  Net saved  Saved%
# vx list                                 5     2000      400      1600   80.0%
# vx cargo test                           3     1500      300      1200   80.0%
# vx check                                4     1000      500       500   50.0%

# JSON format — for dashboards or further processing
vx metrics tokens --json

# Last N runs only
vx metrics tokens --last 20

How Tracking Works

vx records a TokenSavingsRecord for every command that uses a structured output format (toon, json, compact). Each record captures:

FieldDescription
baseline_tokensEstimated tokens if output were unformatted text
actual_tokensActual tokens in the structured/filtered output
token_deltaPositive = tokens saved
savings_ratioFraction saved (0.0–1.0)

Token estimation: 1 token ≈ 4 UTF-8 bytes (heuristic, matches GPT/Claude tokenizers within ~10%).

Enabling Token Tracking

Token savings are tracked automatically when you use vx's built-in output modes:

# These automatically track savings
vx list --output-format toon           # TOON format (40-60% savings)
vx check --json                         # JSON format
vx --compact cargo test                 # Compact wrapper

# Set globally to track all commands
export VX_OUTPUT=toon

Manual Measurement (for piped commands)

For commands piped through vx rg, track savings manually:

# Quick comparison: raw vs filtered
vx cargo test > /tmp/vx-raw.log 2>&1
vx rg "test result:|FAILED" /tmp/vx-raw.log > /tmp/vx-filtered.log
echo "Raw: $(wc -c < /tmp/vx-raw.log) bytes → Filtered: $(wc -c < /tmp/vx-filtered.log) bytes"

# Or one-liner (runs command twice — use for benchmarking only)
echo "Saved: $(echo "scale=1; 100 - $(vx cargo test 2>&1 | vx rg 'test result:|FAILED' | wc -c) * 100 / $(vx cargo test 2>&1 | wc -c)" | bc)%"

AI Summary (For Agent Introspection)

# Full metrics with AI-friendly JSON summary
vx metrics --json

# Includes timing, stages, and token savings
# Useful for agents to self-optimize their command patterns

HTML Report (For Human Review)

# Generate visual report of all metrics including token savings
vx metrics --html report.html

Expected Savings by Command Category

CategoryRaw outputOptimal filterTokens savedTechnique
Build (cargo/go/tsc)5K-20K50-50090-99%vx rg "^error"
Test (cargo/pytest/jest)2K-25K50-20090-99%vx rg "test result:|FAILED"
Lint (clippy/ruff/eslint)2K-10K100-50085-95%vx rg "^(error|warning)\["
Git operations1K-10K100-50075-90%--stat, --name-only, --short
GitHub CLI500-5K50-20080-95%--json field1,field2 --jq
CI logs10K-100K200-100095-99%--log-failed | vx rg -m 50
File listing500-5K50-20080-95%vx fd with type filters

Advanced Patterns

The Tee-on-Failure Pattern

Save full output only when commands fail (zero cost on success):

# Bash/Zsh/PowerShell compatible
vx cargo test 2>&1 | vx rg "test result:|FAILED|panicked"; if [ $? -ne 0 ]; then vx cargo test 2>&1 > .vx-debug.log && echo "[full log: .vx-debug.log]"; fi

# Simpler: always tee, only read on failure
vx cargo test > .vx-last.log 2>&1; EXIT=$?; vx rg "test result:|FAILED" .vx-last.log; if [ $EXIT -ne 0 ]; then echo "[debug: .vx-last.log]"; fi

Deduplication — Collapse Repeated Lines

# Sort + unique count (great for repeated warnings)
vx cargo clippy 2>&1 | vx rg "^warning\[" | sort | uniq -c | sort -rn

# Output:
#   12 warning[dead_code]
#    5 warning[unused_import]
#    2 warning[unused_variable]

Cascading Filters — Cheap to Expensive

Always start with the cheapest check and only expand if needed:

# 1. Pass/fail? (2 tokens)
vx cargo test 2>&1 > /dev/null && echo "PASS" || echo "FAIL"

# 2. If FAIL → how many failures? (5 tokens)
vx rg -c "FAILED" .vx-last.log

# 3. If > 0 → which tests? (~50 tokens)
vx rg "FAILED" .vx-last.log

# 4. If unclear → context around failures (~200 tokens)
vx rg -B 2 -A 5 "FAILED" .vx-last.log

# 5. Only if truly stuck → full section
vx rg -A 30 "failures:" .vx-last.log

JSON Projection for Structured Commands

Many tools support JSON output natively — use it with vx jq:

# cargo metadata — just workspace members
vx cargo metadata --format-version=1 2>&1 | vx jq '[.workspace_members[]]'

# npm — just outdated packages
vx npm outdated --json 2>&1 | vx jq 'to_entries | .[] | "\(.key): \(.value.current) → \(.value.latest)"'

# pytest json report
vx uv run pytest --json-report --json-report-file=- 2>&1 | vx jq '.summary'

Platform-Specific Notes

Stderr Redirection (2>&1)

The 2>&1 syntax works in all shells (PowerShell, Bash, cmd). It merges stderr into stdout so vx rg can filter both streams. Always include it for build/test commands that may emit errors on stderr.

PowerShell Pipe Encoding

PowerShell may convert pipe output to UTF-16. If vx rg shows garbled output:

$env:PYTHONIOENCODING="utf-8"
[Console]::OutputEncoding = [System.Text.Encoding]::UTF8

Or use vx's subprocess environment which handles encoding automatically.

Long Commands on Windows

Windows cmd.exe has a ~8191 character limit. For complex filter patterns:

# Put pattern in a file
echo "error|FAILED|panicked|could not compile" > .vx-filter.txt
vx cargo test 2>&1 | vx rg -f .vx-filter.txt

Decision Tree for Agents

Need to run a command?
│
├─ Is this a vx native command (list, check, versions)?
│  └─ Use --json or --output-format toon
│
├─ Is this `vx gh` (GitHub CLI)?
│  └─ ALWAYS use --json field1,field2 --jq 'projection'
│
├─ Is this a test command?
│  ├─ cargo test → vx cargo test -p <crate> 2>&1 | vx rg "test result:|FAILED"
│  ├─ pytest     → vx uv run pytest --no-header --tb=line -q 2>&1
│  ├─ jest       → vx npm test 2>&1 | vx rg "Tests:|FAIL"
│  └─ go test    → vx go test ./... 2>&1 | vx rg "^(ok|FAIL)"
│
├─ Is this a build command?
│  └─ vx cargo build 2>&1 | vx rg "^error|could not compile"
│
├─ Is this a lint command?
│  └─ vx cargo clippy 2>&1 | vx rg "^(error|warning)\["
│
├─ Will output be small (<20 lines)?
│  └─ Run directly: vx <command>
│
├─ Will output be large or unknown?
│  ├─ Just pass/fail? → && echo "PASS" || echo "FAIL"
│  ├─ Error details?  → 2>&1 | vx rg "error|FAILED|panic"
│  ├─ Count only?     → 2>&1 | vx rg -c "pattern"
│  └─ Context?        → 2>&1 | vx rg -C 3 "error|FAILED"
│
└─ CI log?
   └─ vx gh run view N --log-failed 2>&1 | vx rg -m 50 "error|FAIL"

Anti-Patterns

# BAD: Platform-specific filtering
vx just test 2>&1 | Select-String -Pattern "FAILED"          # PowerShell only
vx just test 2>&1 | grep -E "FAILED"                         # Unix only
vx just test 2>&1 | findstr /R "FAILED"                      # cmd only

# GOOD: Universal with vx
vx just test 2>&1 | vx rg "FAILED"                           # Everywhere

# BAD: Dumping full verbose output
vx cargo test                                                  # 500+ lines
vx gh run view 789 --log                                      # 10000+ lines
vx gh pr view 123                                             # 500+ tokens of formatting
vx uv run pytest                                              # 200+ lines

# GOOD: Filtered to actionable information
vx cargo test 2>&1 | vx rg "test result:|FAILED"             # 2-5 lines
vx gh run view 789 --log-failed 2>&1 | vx rg -m 30 "error"  # ≤30 lines
vx gh pr view 123 --json title,state,mergeable               # ~30 tokens
vx uv run pytest --no-header --tb=line -q 2>&1               # 2-10 lines

# BAD: Full coverage / dependency trees
vx uv run pytest --cov                                        # 100+ lines table
vx npm list                                                    # huge tree

# GOOD: Summary extraction
vx uv run pytest --cov 2>&1 | vx rg "^TOTAL"                 # 1 line
vx npm list --depth=0 2>&1                                    # top-level only

# BAD: Multiple platform-aware branches
if ($IsWindows) { ... | Select-String ... } else { ... | grep ... }

# GOOD: One command everywhere
vx <cmd> 2>&1 | vx rg "pattern"

Quick Reference Card

TaskToken-efficient command~Tokens
Rust test pass/failvx cargo test 2>&1 && echo OK || echo FAIL2
Rust test summaryvx cargo test 2>&1 | vx rg "test result:"10
Rust test failuresvx cargo test 2>&1 | vx rg "FAILED|panicked"20-100
Rust build errorsvx cargo build 2>&1 | vx rg "^error"20-200
Clippy countvx cargo clippy 2>&1 | vx rg -c "^(error|warning)\["2
Clippy issuesvx cargo clippy 2>&1 | vx rg "^(error|warning)\["50-200
pytest summaryvx uv run pytest --no-header --tb=line -q 2>&110-50
pytest failuresvx uv run pytest --tb=short -q 2>&1 | vx rg "FAILED|^E "50-200
Node test summaryvx npm test 2>&1 | vx rg "Tests:|FAIL"10-50
Go test summaryvx go test ./... 2>&1 | vx rg "^(ok|FAIL)"10-50
Just quick-checkvx just quick 2>&1 | vx rg "error|FAILED|Finished"20-100
Git changesvx git diff --name-only10-30
Git statusvx git status --short --branch10-50
PR statusvx gh pr view N --json title,state,mergeable30-50
PR filesvx gh pr view N --json files --jq '.files[].path'20-50
CI statusvx gh pr checks N --json name,conclusion --jq '...'30-100
CI errorsvx gh run view N --log-failed 2>&1 | vx rg -m 30 "error"100-500
Search codevx rg -n -m 20 "pattern" src/50-200
Find filesvx fd "pattern" --type f10-50