# Data Reference
## Overview
The FinLab data module provides comprehensive access to stock market data, including prices, financial statements, fundamental metrics, institutional trading, and economic indicators. Use `data.get()` to retrieve any dataset using a simple path-based syntax.
---
## Table of Contents
- [Usage](#usage) - Basic data retrieval with `data.get()`
- [Data Discovery](#data-discovery) - Find datasets with `data.search()`
- [Technical Indicators](#technical-indicators) - Compute indicators with `data.indicator()`
- [Universe Filtering](#universe-filtering) - Filter by market/category
- [Data Catalog](#data-catalog) - Complete dataset reference
- [Storage Configuration](#storage-configuration) - Cache and storage settings
- [Plotting Data](#plotting-data) - Visualization examples
---
## Usage
### Basic Syntax
```python
from finlab import data
# Retrieve data using TABLE:COLUMN format
df = data.get('price:收盤價')
# For tables without columns, use TABLE directly
inventory = data.get('inventory')
```
### Data Path Construction
The path is constructed using colons (`:`) to navigate hierarchical data structures:
**Format:**
- `
:` - For tables with multiple columns
- `` - For tables without column structure
**Example:**
```python
# Get closing price
closing_price = data.get('price:收盤價')
# Get earnings per share
eps = data.get('financial_statement:每股盈餘')
# Get foreign investor trading volume
foreign_investment = data.get('institutional_investors_trading_summary:外陸資買賣超股數(不含外資自營商)')
```
### Common Field Names
| Term | Field |
|------|-------|
| EPS | `financial_statement:每股盈餘` |
| 本益比 | `price_earning_ratio:本益比` |
| 股價淨值比 | `price_earning_ratio:股價淨值比` |
| 殖利率 | `price_earning_ratio:殖利率(%)` |
| ROE | `fundamental_features:ROE稅後` |
| 毛利率 | `fundamental_features:營業毛利率` |
| 月營收 | `monthly_revenue:當月營收` |
| 營收年增率 | `monthly_revenue:去年同月增減(%)` |
---
## Data Discovery
Use `data.search()` to programmatically find datasets from the Data Catalog.
**Signature:**
```python
data.search(keyword: str = None, market: str = 'tw') -> list
```
**Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `keyword` | str | None | Filter datasets by keyword (case-insensitive substring match). Returns all if omitted. |
| `market` | str | 'tw' | Market to search: `'tw'`, `'us'`, or `'all'` (both markets). |
**Returns:** List of `"table:column"` strings, usable directly with `data.get()`.
**Examples:**
```python
from finlab import data
# List all available datasets
all_data = data.search()
# Search by keyword
data.search('收盤') # ['price:收盤價']
data.search('營收') # ['monthly_revenue:當月營收', ...]
# Search US market data (testing)
data.search('close', market='us')
# Use result with data.get()
results = data.search('收盤')
df = data.get(results[0])
```
---
## Technical Indicators
Use `data.indicator()` to compute technical indicators using TA-Lib. Returns a FinLabDataFrame with the same structure as price data (date index, stock columns).
**Signature:**
```python
data.indicator(
indname: str,
adjust_price: bool = False,
resample: str = 'D',
**kwargs
) -> FinLabDataFrame | tuple
```
**Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `indname` | str | required | Indicator name: `SMA`, `EMA`, `RSI`, `MACD`, `STOCH`, `BBANDS`, `ATR`, `ADX`, etc. |
| `adjust_price` | bool | False | Use adjusted prices for calculation. |
| `resample` | str | 'D' | Data frequency: `'D'` (daily), `'W'` (weekly), `'M'` (monthly). |
| `**kwargs` | - | - | Indicator-specific parameters (e.g., `timeperiod`, `fastperiod`, `slowperiod`). |
**Returns:**
- Single indicator (SMA, RSI, etc.): `FinLabDataFrame`
- Multi-output indicator (STOCH, MACD, BBANDS): `tuple` of `FinLabDataFrame`
**Common Indicators:**
| Indicator | Parameters | Returns | Description |
|-----------|------------|---------|-------------|
| `SMA` | `timeperiod=30` | DataFrame | Simple Moving Average |
| `EMA` | `timeperiod=30` | DataFrame | Exponential Moving Average |
| `RSI` | `timeperiod=14` | DataFrame | Relative Strength Index (0-100) |
| `STOCH` | `fastk_period=5, slowk_period=3, slowd_period=3` | (k, d) tuple | Stochastic Oscillator |
| `MACD` | `fastperiod=12, slowperiod=26, signalperiod=9` | (macd, signal, hist) tuple | MACD |
| `BBANDS` | `timeperiod=5, nbdevup=2, nbdevdn=2` | (upper, middle, lower) tuple | Bollinger Bands |
| `ATR` | `timeperiod=14` | DataFrame | Average True Range |
| `ADX` | `timeperiod=14` | DataFrame | Average Directional Index |
**Examples:**
```python
from finlab import data
# Simple moving average
sma20 = data.indicator('SMA', timeperiod=20)
sma60 = data.indicator('SMA', timeperiod=60)
# RSI
rsi = data.indicator('RSI', timeperiod=14)
oversold = rsi < 30
# Stochastic oscillator (returns tuple)
k, d = data.indicator('STOCH', adjust_price=True)
# MACD (returns tuple)
macd, signal, hist = data.indicator('MACD')
# Bollinger Bands (returns tuple)
upper, middle, lower = data.indicator('BBANDS', timeperiod=20)
# Weekly RSI
rsi_weekly = data.indicator('RSI', timeperiod=14, resample='W')
```
---
## Universe Filtering
Limit the data fetch scope by market or industry category using a context manager or global settings.
### Supported Markets
- `ALL` - All markets
- `TSE` - Listed stocks (上市)
- `OTC` - Over-The-Counter (上櫃)
- `TSE_OTC` - Both TSE and OTC
- `ETF` - Exchange Traded Funds
### Supported Categories
**Industry Categories:**
光電業, 其他, 其他電子業, 化學工業, 半導體, 塑膠工業, 存託憑證, 建材營造, 文化創意業, 橡膠工業, 水泥工業, 汽車工業, 油電燃氣業, 玻璃陶瓷, 生技醫療, 生技醫療業, 紡織纖維, 航運業, 觀光事業, 貿易百貨, 資訊服務業, 農業科技, 通信網路業, 造紙工業, 金融, 鋼鐵工業, 電器電纜, 電子商務, 電子通路業, 電子零組件, 電機機械, 電腦及週邊, 食品工業
**ETF Categories:**
domestic_etf, foreign_etf, leveraged_etf, vanilla_futures_etf, leveraged_futures_etf
### Parameters
- `market` (str): One of 'ALL', 'TSE', 'OTC', 'TSE_OTC', 'ETF'
- `category` (str or list): Industry NAMES only (no numeric codes); supports regex fuzzy match (e.g. '電子' matches multiple electronics categories)
- `exclude_category` (str or list or None): Excluded industry NAMES only (no numeric codes); same regex rules as category
### Important Notes
- Regex matching is used for categories. To match exact '其他', use '^其他$'
- When both category and exclude_category are provided, select category first, then subtract exclude_category
- Use data.universe ONLY to scope data.get() or backtest.sim() — do NOT wrap DataFrame/factor operations (e.g., position = ...)
- Do NOT use category codes (代號) like '28'; use industry names instead (e.g., exclude_category='金融')
### Examples
**Context Manager (Recommended):**
```python
from finlab import data
# Filter by market and category
with data.universe(market='TSE_OTC', category=['水泥工業']):
price = data.get('price:收盤價')
# Exact match using regex
with data.universe(market='TSE_OTC', category=['^其他$']):
close_subset = data.get('price:收盤價')
# Exclude specific categories
with data.universe(market='TSE_OTC', category=['水泥工業'], exclude_category=['金融']):
price = data.get('price:收盤價')
```
**Global Setting:**
```python
from finlab import data
data.set_universe(market='TSE_OTC', category='水泥工業', exclude_category='金融')
price = data.get('price:收盤價')
```
---
## Data Catalog
### Price & Trading Data
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `price` | 上市櫃市場成交資訊 | 成交股數, 成交筆數, 成交金額, 收盤價, 開盤價, 最低價, 最高價, 最後揭示買價, 最後揭示賣價, 最後揭示買量, 最後揭示賣量 |
| `etl` | 還原權值股價 | adj_close, adj_open, adj_high, adj_low |
| `intraday_odd_lot_trade` | 上市櫃盤中零股成交資訊 | 成交股數, 成交筆數, 成交金額, 收盤價, 開盤價, 最低價, 最高價, 最後揭示買價, 最後揭示賣價, 最後揭示買量, 最後揭示賣量 |
| `after_market_odd_lot_trade` | 上市櫃盤後零股成交資訊 | 成交股數, 成交筆數, 成交金額, 成交價, 最後揭示買價, 最後揭示賣價, 最後揭示買量, 最後揭示賣量 |
| `intraday_trading` | 現股當沖成交資訊 | 當日沖銷交易成交股數, 當日沖銷交易買進成交金額, 當日沖銷交易賣出成交金額, 得先賣後買當沖 |
| `rotc_price` | 興櫃市場成交資訊 | 成交股數, 成交金額, 開盤價, 收盤價, 最高價, 最低價, 日均價, 成交筆數, 最後揭示買價, 最後揭示賣價 |
### Valuation Metrics
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `price_earning_ratio` | 個股日本益比、殖利率及股價淨值比 | 殖利率(%), 本益比, 股價淨值比 |
| `etl` | 個股市值 | market_value |
### Revenue Data
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `monthly_revenue` | 上市櫃月營收 | 當月營收, 上月營收, 去年當月營收, 上月比較增減(%), 去年同月增減(%), 當月累計營收, 去年累計營收, 前期比較增減(%) |
| `rotc_monthly_revenue` | 興櫃月營收 | 當月營收, 上月營收, 去年當月營收, 上月比較增減(%), 去年同月增減(%), 當月累計營收, 去年累計營收, 前期比較增減(%), 備註 |
### Financial Statements
The `financial_statement` table contains comprehensive balance sheet, income statement, and cash flow data with 100+ columns including:
**Balance Sheet Items:**
- Assets: 現金及約當現金, 流動資產, 非流動資產, 資產總額
- Liabilities: 流動負債, 非流動負債, 負債總額
- Equity: 股本, 資本公積合計, 保留盈餘, 股東權益總額
**Income Statement Items:**
- Revenue & Costs: 營業收入淨額, 營業成本, 營業毛利
- Expenses: 研究發展費, 推銷費用, 管理費用
- Profit: 營業利益, 稅前淨利, 歸屬母公司淨利(損), 每股盈餘
**Cash Flow Items:**
- Operating: 營業活動之淨現金流入(流出)
- Investing: 投資活動之淨現金流入(流出), 取得不動產_廠房及設備
- Financing: 籌資活動之淨現金流入(流出), 發放現金股利
### Fundamental Features
The `fundamental_features` table contains 50+ calculated financial metrics:
**Profitability Metrics:**
- ROA稅後息前, ROA綜合損益, ROE稅後, ROE綜合損益
- 營業毛利率, 營業利益率, 稅前淨利率, 稅後淨利率
**Growth Metrics:**
- 營收成長率, 營業毛利成長率, 營業利益成長率, 稅前淨利成長率, 稅後淨利成長率
**Efficiency Metrics:**
- 總資產週轉次數, 應收帳款週轉率, 存貨週轉率, 固定資產週轉次數
**Liquidity Metrics:**
- 流動比率, 速動比率, 現金流量比率
**Leverage Metrics:**
- 負債比率, 總負債除總淨值
**Per Share Metrics:**
- 每股營業額, 每股營業利益, 每股現金流量, 每股稅前淨利, 每股綜合損益, 每股稅後淨利
### Institutional Trading
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `institutional_investors_trading_summary` | 三大法人買賣超 | 外陸資買進股數(不含外資自營商), 外陸資賣出股數(不含外資自營商), 外陸資買賣超股數(不含外資自營商), 外資自營商買進股數, 外資自營商賣出股數, 外資自營商買賣超股數, 投信買進股數, 投信賣出股數, 投信買賣超股數, 自營商買進股數(自行買賣), 自營商賣出股數(自行買賣), 自營商買賣超股數(自行買賣), 自營商買進股數(避險), 自營商賣出股數(避險), 自營商買賣超股數(避險) |
| `foreign_investors_shareholding` | 外資持股比率 | 發行股數, 外資及陸資尚可投資股數, 全體外資及陸資持有股數, 外資及陸資尚可投資比率, 全體外資及陸資持股比率, 外資及陸資共用法令投資上限比率, 陸資法令投資上限比率 |
### Margin Trading
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `margin_transactions` | 融資券 | 融資買進, 融資賣出, 融資現金償還, 融資前日餘額, 融資今日餘額, 融資限額, 融券買進, 融券賣出, 融券現券償還, 融券前日餘額, 融券今日餘額, 融券限額, 資券互抵, 註記, 融資使用率, 融券使用率 |
| `security_lending` | 借券 | 前日借券餘額, 借券, 借券還券, 借券增減, 借券餘額 |
| `security_lending_sell` | 借券賣出 | 借券賣出, 借券賣出還券, 借券賣出餘額, 借券賣出限額 |
### Insider Trading
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `internal_equity_changes` | 內部人持股變化 | 發行股數, 董監增加股數, 董監減少股數, 董監持有股數, 董監持有股數占比, 經理人持有股數, 百分之十以上大股東持有股數, 市場別 |
| `internal_equity_pledge` | 內部人質押 | 董監持股, 董監設質, 董監解質, 董監累計設質, 董監設質股數占比, 經理人持股, 百分之十以上大股東持有股數, 經理人及百分之十以上大股東設質股數, 經理人及百分之十以上大股東設質股數占比, 市場別 |
| `inventory` | 集保餘額 | (Unstructured table - use directly) |
### Corporate Actions
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `dividend_tse` | 上市除權息 | 除權息前收盤價, 除權息參考價, 權值+息值, 權息, 漲停價格, 跌停價格, 開盤競價基準, 減除股利參考價, 詳細資料, 最近一次申報資料 季別日期, 最近一次申報每股 (單位)淨值, 最近一次申報每股 (單位)盈餘, twse_divide_ratio |
| `dividend_otc` | 上櫃除權息 | 除權息前收盤價, 除權息參考價, 權值, 息值, 權+息值, 權息, 漲停價格, 跌停價格, 開盤競價基準, 減除股利參考價, 現金股利, 每千股無償配股, 現金增資股數, 現金增資認購價, 公開承銷股數, 員工認購股數, 原股東認購數, 按持股比例千股認購, otc_divide_ratio |
| `capital_reduction_tse` | 上市減資 | 恢復買賣日期, 減資原因, 恢復買賣參考價, 停止買賣前收盤價格, 漲停價格, 跌停價格, 開盤競價基準, 除權參考價, twse_cap_divide_ratio |
| `capital_reduction_otc` | 上櫃減資 | 恢復買賣日期, 減資原因, 開始交易基準價, 最後交易之收盤價格, 減資恢復買賣開始日參考價格, 漲停價格, 跌停價格, 除權參考價, otc_cap_divide_ratio |
| `treasury_stock` | 庫藏股 | 買回目的, 買回股份總金額上限, 預定買回股數, 買回價格區間-最低, 買回價格區間-最高, 預定買回期間-起, 預定買回期間-迄, 是否執行完畢, 本次已買回股數, 本次執行完畢已註銷或轉讓股數, 本次已買回股數佔預定買回股數比例(%), 本次已買回總金額, 本次平均每股買回價格, 本次買回股數佔公司已發行股份總數比例(%), 本次未執行完畢之原因 |
### Market Indices
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `benchmark_return` | 回測基準 | 發行量加權股價報酬指數 |
| `taiex_total_index` | 發行量加權股價指數歷史資料 | 開盤指數, 最高指數, 最低指數, 收盤指數 |
| `stock_index_price` | 指數資訊 | 收盤指數, 漲跌百分比(%) |
| `stock_index_vol` | 指數成交量資訊 | 成交股數, 成交金額, 成交筆數 |
| `world_index` | 世界指數 | open, high, low, close, adj_close, volume |
### Economic Indicators
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `tw_business_indicators` | 景氣指標 | 景氣對策信號(分), 領先指標綜合指數(點), 領先指標不含趨勢指數(點), 同時指標綜合指數(點), 同時指標不含趨勢指數(點), 落後指標綜合指數(點), 落後指標不含趨勢指數(點) |
| `tw_total_pmi` | 製造業採購經理人指數 | 製造業PMI, 新增訂單數量, 生產數量, 人力僱用數量, 供應商交貨時間, 存貨, 客戶存貨, 原物料價格, 未完成訂單, 新增出口訂單, 進口原物料數量, 未來六個月展望 |
| `tw_total_nmi` | 非製造業採購經理人指數 | 臺灣非製造業NMI, 商業活動, 新增訂單, 人力僱用, 供應商交貨時間, 存貨, 採購價格, 未完成訂單, 服務輸出出口, 服務輸入進口, 服務收費價格, 存貨觀感, 未來六個月展望 |
| `tw_monetary_aggregates` | 貨幣總計數年增率 | 年增率(%) |
### Futures Data
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `futures_price` | 期貨日成交資訊 | 到期月份(週別), 開盤價, 最高價, 最低價, 收盤價, 漲跌價, 漲跌幅, 成交量, 未沖銷契約數 |
| `futures_institutional_investors_trading_summary` | 期貨三大法人盤後資訊 | 多方交易口數, 空方交易口數, 多空交易口數淨額, 多方未平倉口數, 空方未平倉口數, 多空未平倉口數淨額, 多方交易契約金額(千元), 空方交易契約金額(千元), 多空交易契約金額淨額(千元), 多方未平倉契約金額(千元), 空方未平倉契約金額(千元), 多空未平倉契約金額淨額(千元) |
### Convertible Bonds
| Table Name | Description | Available Columns |
|-----------|-------------|-------------------|
| `cb_price` | 可轉換公司債成交資訊 | 成交張數, 成交筆數, 成交金額, 收盤價, 開盤價, 最低價, 最高價 |
| `cb_converted_status` | 可轉換公司債每月轉換普通股 | 本月轉換張數, 轉(交)換或認股價格(元), 債券轉(交)換或認購普通股 |
### Company Information
| Table Name | Description |
|-----------|-------------|
| `company_basic_info` | 企業基本資訊 |
| `company_main_business` | 企業主要經營業務 |
| `important_subsidiary` | 企業重要子公司資訊 |
| `security_categories` | 證券分類 |
| `security_industry_themes` | 產業題材 |
**Stock ID to Name Mapping:**
```python
# company_basic_info 的 index 是流水號,需用 stock_id 欄位對應
info = data.get("company_basic_info")
name_map = dict(zip(info["stock_id"], info["公司簡稱"]))
# 用法: name_map.get("2330") -> "台積電"
df["股票名稱"] = df.index.map(lambda x: name_map.get(x, x))
```
### Special Status
| Table Name | Description | Usage |
|-----------|-------------|-------|
| `etl:disposal_stock_filter` | 排除處置股 | Boolean filter |
| `etl:noticed_stock_filter` | 排除注意股 | Boolean filter |
| `etl:full_cash_delivery_stock_filter` | 排除全額交割股 | Boolean filter |
| `trading_attention` | 注意股 | Status table |
| `disposal_information` | 處置股 | Status table |
| `change_transaction` | 上市櫃變更交易 | 變更交易, 分盤交易 |
### Special Dates
| Table Name | Description |
|-----------|-------------|
| `etl:financial_statements_deadline` | 財報截止日 |
| `etl:financial_statements_disclosure_dates` | 財報電子檔上傳日 |
| `financial_statements_upload_detail` | 財報電子檔上傳紀錄 |
---
## Plotting Data
Use `etl:adj_close` for historical comparison (backward adjusted, 向後還原).
```python
import matplotlib.pyplot as plt
from finlab import data, ffn_core
adj_close = data.get('etl:adj_close')
adj_close[['2330', '2317', '2454']].loc['2020':].rebase().plot(figsize=(12, 6))
plt.title('股價走勢比較')
plt.show()
```
---
## Storage Configuration
Control how data is cached locally. By default, FinLab caches data to disk to avoid repeated downloads.
### data.set_storage()
Configure the storage backend for data caching.
**Signature:**
```python
data.set_storage(storage: Storage) -> None
```
**Parameters:**
- `storage` (Storage, required): Storage backend instance. Options:
- `data.FileStorage()` - Disk-based storage (default)
- `data.CacheStorage()` - In-memory storage (faster, but lost on restart)
**Example:**
```python
from finlab import data
# Use in-memory storage (faster for repeated access in same session)
data.set_storage(data.CacheStorage())
# Use disk storage (default, persists across sessions)
data.set_storage(data.FileStorage())
```
### data.clear()
Clear all cached data from the current storage backend.
**Signature:**
```python
data.clear() -> None
```
**Example:**
```python
from finlab import data
# Clear all cached data
data.clear()
# Next data.get() will re-download from cloud
close = data.get('price:收盤價')
```
### Configuration Flags
Control data fetching behavior using module-level flags.
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `data.use_local_data_only` | bool | False | Prevent cloud downloads; use only local cache. Raises error if data not cached. |
| `data.force_cloud_download` | bool | False | Always download fresh data from cloud, ignoring cache. |
| `data.prefer_local_if_exists` | bool | False | Use local cache without checking expiry. Faster startup. |
| `data.truncate_start` | str/None | None | Filter data to start from this date (format: `'YYYY-MM-DD'`). |
| `data.truncate_end` | str/None | None | Filter data to end at this date (format: `'YYYY-MM-DD'`). |
**Examples:**
```python
from finlab import data
# Offline mode - use only local cache
data.use_local_data_only = True
close = data.get('price:收盤價') # Fails if not cached
# Force fresh download
data.force_cloud_download = True
close = data.get('price:收盤價') # Always downloads
# Use local cache without expiry check (faster)
data.prefer_local_if_exists = True
close = data.get('price:收盤價')
# Limit data range
data.truncate_start = '2020-01-01'
data.truncate_end = '2023-12-31'
close = data.get('price:收盤價') # Only 2020-2023 data
# Reset to defaults
data.use_local_data_only = False
data.force_cloud_download = False
data.prefer_local_if_exists = False
data.truncate_start = None
data.truncate_end = None
```
### Data Freshness Check
Verify cached data is up-to-date before executing trades.
```python
from finlab import data
# Check if all cached data is fresh (not expired, no pending crawl updates)
if data.is_tradable():
print("Data is fresh — safe to trade")
else:
print(f"Wait until: {data.suggested_tradable_time()}")
# Check next cache expiry time (UTC)
print(data.next_future_expiry())
```
| Function | Returns | Description |
|----------|---------|-------------|
| `data.is_tradable(market=None)` | `bool` | `True` when all cached data is not expired AND no more crawl updates expected before next market open |
| `data.suggested_tradable_time(market=None)` | `datetime\|None` | When `is_tradable()` will become `True`. `None` if already tradable |
| `data.next_future_expiry()` | `datetime\|None` | Next future expiry time (UTC) among cached datasets |
These are also available as Report methods: `report.is_tradable()`, `report.suggested_tradable_time()`, `report.next_future_expiry()`.
---
## Related References
- [FinlabDataFrame Reference](dataframe-reference.md) - Enhanced DataFrame methods for data manipulation
- [Backtesting Reference](backtesting-reference.md) - How to use data in backtesting
- [Factor Examples](factor-examples.md) - Practical examples using various datasets
- [Factor Analysis Reference](factor-analysis-reference.md) - Analyze factor effectiveness