FinRL RL Trading

Automation

Use ensemble deep reinforcement learning (A2C, DDPG, PPO, TD3, SAC) to execute automated multi-market stock trading with

Install

openclaw skills install finrl-rl-trading

FinRL Reinforcement Learning Trading (finrl-rl-trading)

Use ensemble deep reinforcement learning (A2C, DDPG, PPO, TD3, SAC) to execute automated multi-market stock trading.

Pipeline

data_collection -> data_storage -> factor_computation -> target_selection -> trading_execution -> visualization
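The stage order above can be sketched as a simple chain of handlers. This is only an illustration under stated assumptions: the `run_pipeline` function, the handler signature, and the `completed` key are hypothetical and are not taken from the skill's own code.

```python
from typing import Any, Callable, Dict

# Stage names copied from the pipeline line above, in execution order.
STAGES = [
    "data_collection",
    "data_storage",
    "factor_computation",
    "target_selection",
    "trading_execution",
    "visualization",
]

# Hypothetical handler type: takes the shared context and returns it updated.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(handlers: Dict[str, Stage], context: Dict[str, Any]) -> Dict[str, Any]:
    """Run every stage in order, failing early if a handler is missing."""
    missing = [name for name in STAGES if name not in handlers]
    if missing:
        raise KeyError(f"missing handlers for stages: {missing}")
    for name in STAGES:
        context = handlers[name](context)
        # Record progress so a failed run shows how far it got.
        context.setdefault("completed", []).append(name)
    return context
```

Checking for missing handlers before the first stage runs avoids collecting data for a pipeline that cannot finish.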

Top Use Cases (14 total)

Ensemble Stock Trading ICAIF 2020 (UC-101)

Executing automated stock trading using an ensemble of multiple DRL agents (A2C, DDPG, PPO, TD3, SAC) to reduce individual agent weakness and improve…

Triggers: ensemble trading, multiple agents, stock trading
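One way to combine several agents is to pick, for each trading window, the agent with the best risk-adjusted return on a preceding validation window. The sketch below assumes that selection rule (annualized Sharpe ratio with a zero risk-free rate); the source does not state which rule this skill uses, and `sharpe_ratio` and `pick_agent` are hypothetical helper names, not FinRL API calls.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence


def sharpe_ratio(returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed 0)."""
    if len(returns) < 2:
        return float("-inf")
    sd = stdev(returns)
    if sd == 0:
        # Assumption: a flat return series is treated as unrankable.
        return float("-inf")
    return math.sqrt(periods_per_year) * mean(returns) / sd


def pick_agent(validation_returns: Dict[str, Sequence[float]]) -> str:
    """Return the name of the agent with the highest validation Sharpe ratio."""
    return max(validation_returns, key=lambda name: sharpe_ratio(validation_returns[name]))
```

For example, `pick_agent({"A2C": [0.01, -0.02, 0.01], "PPO": [0.01, 0.02, 0.015]})` selects `"PPO"`, because its mean return is positive while A2C's is zero.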

NeurIPS 2018 DRL Training (UC-107)

Training deep reinforcement learning agents (A2C, DDPG, PPO, SAC, TD3) for automated stock trading using the StockTradingEnv environment

Triggers: DRL training, stock trading, A2C

NeurIPS 2018 Ensemble Backtesting (UC-108)

Backtesting multiple trained DRL agents against baseline strategies (MVO, DJIA) to evaluate and compare ensemble trading performance

Triggers: backtesting, ensemble, DRL agents
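Comparing agents against a baseline usually comes down to computing the same metrics on each return series. The following is a minimal sketch, assuming per-period simple returns as input; the metric choice (cumulative return and maximum drawdown) and the function names are illustrative and not confirmed by the source.

```python
from typing import Dict, Sequence


def cumulative_return(returns: Sequence[float]) -> float:
    """Total return from compounding per-period simple returns."""
    value = 1.0
    for r in returns:
        value *= 1.0 + r
    return value - 1.0


def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough loss as a negative fraction (0.0 if none)."""
    value = peak = 1.0
    worst = 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def compare(strategies: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Compute the same metrics for every strategy, including baselines."""
    return {
        name: {
            "cumulative_return": cumulative_return(rets),
            "max_drawdown": max_drawdown(rets),
        }
        for name, rets in strategies.items()
    }
```

Passing the baseline (for example a DJIA return series) through the same `compare` call keeps the comparison like-for-like.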

For all 14 use cases, see references/USE_CASES.md.

Execute trigger: run when the user's intent matches intent_router.uc_entries[].positive_terms AND the user uses an action verb (run/execute/跑/执行/backtest/fetch/collect).
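The trigger rule is a conjunction of two checks, which can be sketched as follows. This assumes plain case-insensitive substring matching; how the real intent router matches `positive_terms` is not described in the source, and `should_execute` is a hypothetical name.

```python
from typing import Iterable

# Action verbs listed in the trigger rule, including the two Chinese verbs.
ACTION_VERBS = ("run", "execute", "跑", "执行", "backtest", "fetch", "collect")


def should_execute(message: str, positive_terms: Iterable[str]) -> bool:
    """True only if the message contains a positive term AND an action verb."""
    text = message.lower()
    has_term = any(term.lower() in text for term in positive_terms)
    # Simplification: substring matching would also match "run" inside "trunk".
    has_verb = any(verb in text for verb in ACTION_VERBS)
    return has_term and has_verb
```

With `positive_terms=["ensemble trading"]`, the message "Run ensemble trading on DJIA" triggers execution, while "What is ensemble trading?" does not because it contains no action verb.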

What I'll Ask You

  • Target market: A-share (default), HK, or crypto? (US stocks in ZVT are half-baked — stockus_nasdaq_AAPL exists but coverage is thin)
  • Data source / provider: eastmoney (free, no account), joinquant (account+paid), baostock (free, good history), akshare, or qmt (broker)?
  • Strategy type: MACD golden-cross, MA crossover, volume breakout, fundamental screen, or custom factor?
  • Time range: start_timestamp and end_timestamp for backtest period
  • Target entity IDs: specific stocks (stock_sh_600000) or index components (SZ1000)?
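The answers to these questions can be collected into a single request object and checked before any data is fetched. The sketch below is an assumption about how that might look: `BacktestRequest` and its field names are hypothetical (only `start_timestamp` and `end_timestamp` come from the list above), and the allowed values are copied from the questions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Allowed values copied from the question list above.
MARKETS = {"A-share", "HK", "crypto"}
PROVIDERS = {"eastmoney", "joinquant", "baostock", "akshare", "qmt"}


@dataclass
class BacktestRequest:
    provider: str
    strategy: str
    start_timestamp: str  # ISO date, e.g. "2020-01-01"
    end_timestamp: str
    entity_ids: List[str]
    market: str = "A-share"  # default stated in the question list

    def validate(self) -> None:
        """Raise ValueError on the first answer that cannot be used."""
        if self.market not in MARKETS:
            raise ValueError(f"unknown market: {self.market!r}")
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")
        start = datetime.fromisoformat(self.start_timestamp)
        end = datetime.fromisoformat(self.end_timestamp)
        if start >= end:
            raise ValueError("start_timestamp must be before end_timestamp")
        if not self.entity_ids:
            raise ValueError("at least one entity ID is required")
```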

Semantic Locks (Fatal)

ID | Rule | On Violation
SL-01 | Execute sell orders before buy orders in every trading cycle | halt
SL-02 | Trading signals MUST use next-bar execution (no look-ahead) | halt
SL-03 | Entity IDs MUST follow format entity_type_exchange_code | halt
SL-04 | DataFrame index MUST be MultiIndex (entity_id, timestamp) | halt
SL-05 | TradingSignal MUST have EXACTLY ONE of: position_pct, order_money, order_amount | halt
SL-06 | filter_result column semantics: True=BUY, False=SELL, None/NaN=NO ACTION | halt
SL-07 | Transformer MUST run BEFORE Accumulator in factor pipeline | halt
SL-08 | MACD parameters locked: fast=12, slow=26, signal=9 | halt
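Three of these locks (SL-01, SL-03, SL-05) are simple structural checks, sketched below. The `TradingSignal` class here is a hypothetical stand-in that only borrows the three sizing field names from SL-05; it is not the real class. The regular expression is likewise an assumption about what `entity_type_exchange_code` permits, based on the example IDs stock_sh_600000 and stockus_nasdaq_AAPL.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed shape for SL-03: three underscore-separated parts.
ENTITY_ID_RE = re.compile(r"[a-z0-9]+_[a-z0-9]+_[A-Za-z0-9.]+")


def is_valid_entity_id(entity_id: str) -> bool:
    return ENTITY_ID_RE.fullmatch(entity_id) is not None


@dataclass
class TradingSignal:  # hypothetical stand-in, not the library class
    entity_id: str
    position_pct: Optional[float] = None
    order_money: Optional[float] = None
    order_amount: Optional[int] = None

    def __post_init__(self) -> None:
        if not is_valid_entity_id(self.entity_id):
            raise ValueError(f"SL-03 violated: {self.entity_id!r}")
        sizing = (self.position_pct, self.order_money, self.order_amount)
        # SL-05: exactly one sizing field may be set.
        if sum(value is not None for value in sizing) != 1:
            raise ValueError("SL-05 violated: set exactly one sizing field")


def order_for_cycle(orders: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """SL-01: put sells before buys; the stable sort keeps relative order."""
    return sorted(orders, key=lambda order: 0 if order[0] == "sell" else 1)
```

Raising in `__post_init__` mirrors the "halt" policy in the table: an invalid signal cannot be constructed at all.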

Full lock definitions: references/LOCKS.md
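To make SL-02, SL-06, and SL-08 concrete, here is a dependency-free sketch of a MACD crossover signal with the locked parameters, followed by a one-bar shift so that a signal computed on bar t is only acted on at bar t+1. The function names and the list-based representation are assumptions for illustration; a real implementation would presumably operate on the MultiIndex DataFrame required by SL-04.

```python
from typing import List, Optional

FAST, SLOW, SIGNAL = 12, 26, 9  # SL-08: locked MACD parameters


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out: List[float] = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def macd_filter_result(closes: List[float]) -> List[Optional[bool]]:
    """SL-06 semantics: True=BUY, False=SELL, None=no action."""
    diff = [f - s for f, s in zip(ema(closes, FAST), ema(closes, SLOW))]
    dea = ema(diff, SIGNAL)
    hist = [d - e for d, e in zip(diff, dea)]
    result: List[Optional[bool]] = [None] * len(closes)
    for i in range(1, len(closes)):
        if hist[i - 1] <= 0 < hist[i]:
            result[i] = True   # golden cross: MACD line crosses above signal line
        elif hist[i - 1] >= 0 > hist[i]:
            result[i] = False  # death cross: MACD line crosses below signal line
    return result


def shift_to_next_bar(signals: List[Optional[bool]]) -> List[Optional[bool]]:
    """SL-02: a signal from bar t is executed at bar t+1, never on bar t."""
    return [None] + signals[:-1] if signals else []
```

The shift is what removes look-ahead: the close used to compute a signal is not known until the bar ends, so the earliest tradable bar is the next one.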

Top Anti-Patterns (25 total)

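  • AP-ZVT-183: An ex-rights adjustment factor of inf/NaN is used directly in multiplication, so price adjustment fails silently
  • AP-ZVT-179: Exceptions raised after a third-party data API exceeds its rate limit are swallowed, so data goes missing silently
  • AP-ZVT-183B: Using the wrong K-line table, HFQ (backward-adjusted) vs. QFQ (forward-adjusted), causes factor calculations to drift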
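The first anti-pattern can be avoided by validating adjustment factors before they touch any price. The sketch below is an assumption about a reasonable guard, not the project's actual fix; `apply_adjust_factor` is a hypothetical name, and rejecting non-positive factors is an extra precaution beyond what AP-ZVT-183 describes.

```python
import math
from typing import List, Sequence


def apply_adjust_factor(prices: Sequence[float], factors: Sequence[float]) -> List[float]:
    """Multiply prices by adjustment factors, failing loudly on bad factors."""
    if len(prices) != len(factors):
        raise ValueError("prices and factors must have the same length")
    adjusted: List[float] = []
    for i, (price, factor) in enumerate(zip(prices, factors)):
        # math.isfinite is False for inf, -inf, and NaN.
        if not math.isfinite(factor) or factor <= 0:
            raise ValueError(f"invalid adjustment factor at index {i}: {factor!r}")
        adjusted.append(price * factor)
    return adjusted
```

Raising instead of skipping the row keeps the failure visible, which is the opposite of the silent behavior both AP-ZVT-183 and AP-ZVT-179 warn about.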

All 25 anti-patterns: references/ANTI_PATTERNS.md

Evidence Quality Notice

[QUALITY NOTICE] This crystal was compiled from blueprint finance-bp-061. Evidence verify ratio = 18.9% and audit fail total = 32. Generated results may have uncaptured requirement gaps. Verify critical decisions against source files (LATEST.yaml / LATEST.jsonl).

Reference Files

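File | Contents | When to Load
references/seed.yaml | V6+ complete authoritative definition (source of truth) | Required reading when behavior or decisions are in dispute
references/ANTI_PATTERNS.md | 25 cross-project anti-patterns | Before starting implementation
references/WISDOM.md | Cross-project lessons worth borrowing | When making architecture decisions
references/CONSTRAINTS.md | Domain + fatal constraints | When rules conflict
references/USE_CASES.md | All KUC-* business scenarios | When a complete example is needed
references/LOCKS.md | SL-* + preconditions + hints | Before generating backtest or trading code
references/COMPONENTS.md | AST component map (split by module) | When looking up an API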

Compiled by Doramagic crystal-compilation-v6.1 from finance-bp-061 blueprint at 2026-04-22T13:00:18.884984+00:00. See human_summary.md for non-technical overview.