Strategy Backtester

Validates the historical behavior of stock-ranking, factor, and portfolio-selection strategies through reproducible backtests that report benchmark comparison, turnover, drawdown, and bias warnings.

Install

openclaw skills install strategy-backtester

Purpose

Use this skill to test whether a ranking, factor mix, or portfolio-selection rule had useful historical behavior before treating it as an investment signal.

Scope

  • Equity ranking and selection strategies.
  • Periodic rebalance backtests from local CSV inputs.
  • Benchmark comparison when benchmark data is available.
  • Bias and robustness review.

Non-goals

  • Do not claim that historical performance predicts future returns.
  • Do not keep tuning parameters until a preferred result appears.
  • Do not issue absolute buy/sell instructions.
  • Do not fetch live market data.

Input contract

Required inputs:

  • SIGNAL_CSV: rows with date, ticker, and score.
  • PRICE_CSV: rows with date, ticker, and close.
  • REBALANCE_FREQUENCY: monthly, quarterly, or yearly.
  • TOP_N: number of selected names per rebalance.

Optional inputs:

  • BENCHMARK_CSV: rows with date and either close or return.
  • FEE_BPS: round-trip fee assumption in basis points.
  • SLIPPAGE_BPS: slippage assumption in basis points.
  • UNIVERSE_HISTORY: point-in-time membership if available.
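
As a rough illustration of the input contract above, the column check could look like the following minimal sketch. The helper name missing_columns and the REQUIRED_COLUMNS mapping are hypothetical and are not taken from scripts/backtest_strategy.py; the sketch also assumes header names are matched case-insensitively.

```python
import csv
import io

# Required columns per input, as listed in the input contract.
REQUIRED_COLUMNS = {
    "SIGNAL_CSV": {"date", "ticker", "score"},
    "PRICE_CSV": {"date", "ticker", "close"},
}

def missing_columns(csv_text, input_name):
    """Return the sorted list of required columns absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumption: header names are compared after trimming and lowercasing.
    header = {name.strip().lower() for name in (reader.fieldnames or [])}
    return sorted(REQUIRED_COLUMNS[input_name] - header)

signal_sample = "date,ticker,score\n2023-01-31,AAA,0.8\n"
price_sample = "date,ticker,price\n2023-01-31,AAA,10.5\n"

print(missing_columns(signal_sample, "SIGNAL_CSV"))  # []
print(missing_columns(price_sample, "PRICE_CSV"))    # ['close']
```

An empty list means the header satisfies the contract; any returned names are the columns to report as missing before a backtest runs.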

Execution workflow

  1. Validate input files and required columns.
  2. Estimate whether the test window and symbol coverage are sufficient.
  3. Run scripts/backtest_strategy.py with explicit rebalance, fee, slippage, and top-N assumptions.
  4. Review performance metrics and benchmark comparison.
  5. Identify bias risks and robustness gaps.
  6. Return the required output sections.
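
To make the rebalance step in the workflow above more concrete, here is a minimal sketch of top-N selection and turnover between two rebalances. It is not the logic of scripts/backtest_strategy.py, which is not shown here. The function names, the equal-weight assumption, and the tie-break by ticker are all illustrative assumptions.

```python
from collections import defaultdict

def select_top_n(signal_rows, top_n):
    """Group (date, ticker, score) rows by date and keep the top_n highest scores."""
    by_date = defaultdict(list)
    for date, ticker, score in signal_rows:
        by_date[date].append((score, ticker))
    picks = {}
    for date in sorted(by_date):
        # Highest score first; ties broken alphabetically by ticker (assumption).
        ranked = sorted(by_date[date], key=lambda pair: (-pair[0], pair[1]))
        picks[date] = [ticker for _, ticker in ranked[:top_n]]
    return picks

def one_way_turnover(previous, current):
    """Fraction of an equal-weight portfolio replaced between two rebalances."""
    if not current:
        return 0.0
    return len(set(current) - set(previous)) / len(current)

rows = [
    ("2023-01-31", "AAA", 0.9), ("2023-01-31", "BBB", 0.7), ("2023-01-31", "CCC", 0.2),
    ("2023-02-28", "AAA", 0.4), ("2023-02-28", "BBB", 0.8), ("2023-02-28", "CCC", 0.6),
]
picks = select_top_n(rows, top_n=2)
print(picks)  # {'2023-01-31': ['AAA', 'BBB'], '2023-02-28': ['BBB', 'CCC']}
print(one_way_turnover(picks["2023-01-31"], picks["2023-02-28"]))  # 0.5
```

Scores here are used only from the rebalance date itself; a real run would also need to confirm that each score was actually known on that date to avoid lookahead bias.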

Required output format

  1. Backtest Setup
  • Strategy name, test window, rebalance frequency, top-N, fees, slippage, benchmark.
  2. Performance Summary
  • Total return, CAGR, volatility, max drawdown, Sharpe, Sortino, turnover, hit rate when available.
  3. Benchmark Comparison
  • Relative return, relative drawdown, and tracking observations when benchmark data exists.
  4. Robustness and Bias Warnings
  • Survivorship bias, lookahead bias, data-snooping risk, liquidity assumptions, fee/slippage sensitivity.
  5. Confidence and Data Gaps
  • Confidence level and missing inputs that could change the conclusion.
  6. Handoff Bundle
  • Include strategy_name, test_window, rebalance_frequency, fee_assumption, slippage_assumption, benchmark, metrics, bias_warnings, confidence, and data_gaps.
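
Several of the Performance Summary metrics can be derived from a series of per-period portfolio returns. The sketch below shows one common way to compute them. It assumes simple (not log) returns, a zero risk-free rate for Sharpe, sample standard deviation, and hit rate defined as the share of positive periods; the function name performance_summary is hypothetical and the skill's own script may use different conventions.

```python
import math

def performance_summary(period_returns, periods_per_year):
    """Compute summary metrics from a list of simple per-period returns (needs >= 2)."""
    equity, peak, max_drawdown = 1.0, 1.0, 0.0
    for r in period_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        # Drawdown is the decline from the running peak, expressed as a negative number.
        max_drawdown = min(max_drawdown, equity / peak - 1.0)

    n = len(period_returns)
    mean = sum(period_returns) / n
    variance = sum((r - mean) ** 2 for r in period_returns) / (n - 1)
    volatility = math.sqrt(variance * periods_per_year)  # annualized
    sharpe = mean * periods_per_year / volatility if volatility > 0 else float("nan")

    return {
        "total_return": equity - 1.0,
        "cagr": equity ** (periods_per_year / n) - 1.0,
        "volatility": volatility,
        "max_drawdown": max_drawdown,
        "sharpe": sharpe,  # assumes a zero risk-free rate
        "hit_rate": sum(1 for r in period_returns if r > 0) / n,
    }

# Four quarterly returns, so the test window is exactly one year.
summary = performance_summary([0.10, -0.05, 0.02, 0.03], periods_per_year=4)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For this one-year example the total return and CAGR coincide at roughly 9.79%, the max drawdown is -5%, and the hit rate is 0.75.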

Shared confidence rubric

  • High: point-in-time signals, adequate price coverage, benchmark available, fees/slippage included, and test window covers multiple market regimes.
  • Medium: usable history and price coverage, but one major robustness input is missing.
  • Low: short history, missing benchmark, sparse price coverage, likely survivorship/lookahead risk, or no fee/slippage assumptions.
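
One possible reading of the rubric above, written as a function, is sketched below. The rubric leaves some overlap between Medium and Low, so this sketch makes a strict assumption: any condition listed under Low forces Low, High requires every listed condition, and Medium is whatever remains. The function and parameter names are hypothetical.

```python
def confidence_level(point_in_time, adequate_price_coverage, benchmark_available,
                     costs_included, multiple_regimes, short_history=False):
    """Map rubric inputs to 'High', 'Medium', or 'Low' under a strict reading."""
    # Any single condition from the Low row is treated as decisive (assumption).
    low_triggers = [
        short_history,
        not benchmark_available,
        not adequate_price_coverage,
        not point_in_time,       # proxy for likely survivorship/lookahead risk
        not costs_included,      # no fee/slippage assumptions
    ]
    if any(low_triggers):
        return "Low"
    if multiple_regimes:
        return "High"
    # Usable inputs, but the window does not span multiple market regimes.
    return "Medium"

print(confidence_level(True, True, True, True, True))    # High
print(confidence_level(True, True, True, True, False))   # Medium
print(confidence_level(True, True, False, True, True))   # Low
```

A looser reading could let a single missing input such as the benchmark yield Medium instead of Low; whichever reading is used should be applied consistently across runs.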

Guardrails

  • Separate observed backtest results from assumptions and inference.
  • Always state that backtests are historical simulations, not forecasts.
  • Downgrade confidence if the test appears overfit or data is not point-in-time.
  • Treat backtest output as one input to stock-picker-orchestrator, not as a trading command.

Trigger examples

  • "Backtest this VN30 value-quality ranking."
  • "Check whether this stock ranking strategy beat VNINDEX historically."
  • "Validate this screening rule before using it for shortlist selection."