Install
openclaw skills install data-cleaner

Clean and standardize multi-format data with AI-powered deduplication, missing-value fill, format normalization, multi-source merge, and Feishu-native output.

Upload messy data and get clean, structured output. Supports multi-format parsing, AI field identification, intelligent dedup/fill/formatting, multi-source join, and Feishu-native output (Bitable plus a quality report doc).
Use cases: E-commerce order cleanup, CRM customer data cleansing, bank statement reconciliation, roster cleanup, multi-system data merge.
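To make the difference between basic and fuzzy dedup concrete, here is a minimal sketch using only the Python standard library. It is not the skill's actual cleaning engine: the function names `exact_dedup` and `fuzzy_dedup`, the sample rows, and the 0.85 similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def exact_dedup(rows, key):
    """Keep the first row for each distinct value of `key`."""
    seen = set()
    kept = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            kept.append(row)
    return kept


def fuzzy_dedup(rows, key, threshold=0.85):
    """Drop a row when its `key` text is at least `threshold` similar to an already kept row."""
    kept = []
    for row in rows:
        text = row[key].strip().lower()
        is_duplicate = any(
            SequenceMatcher(None, text, other[key].strip().lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(row)
    return kept


# Hypothetical sample data: one exact duplicate and one near-duplicate name.
rows = [
    {"name": "John Smith", "phone": "13800138000"},
    {"name": "John Smith", "phone": "13800138000"},
    {"name": "Jon Smith", "phone": "13900139000"},
    {"name": "Alice Wong", "phone": "13700137000"},
]

unique_by_phone = exact_dedup(rows, "phone")          # removes the exact repeat
unique_by_name = fuzzy_dedup(unique_by_phone, "name")  # also removes "Jon Smith"
```

The fuzzy pass compares every row against every kept row, so it is quadratic in the number of rows; that is acceptable for a sketch but would likely need blocking or indexing on larger inputs.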
1xx-xxxx-xxxx
YYYY-MM-DD

| Feature | Free | Basic | Standard | Pro |
|---|---|---|---|---|
| Multi-format parsing | ✅ | ✅ | ✅ | ✅ |
| Basic dedup | ✅ | ✅ | ✅ | ✅ |
| Monthly rows | 50 | 500 | 3,000 | Unlimited |
| Data sources | 1 | 3 | Unlimited | Unlimited |
| Smart fill | ❌ | ❌ | ✅ | ✅ |
| Format standardization | ❌ | ❌ | ✅ | ✅ |
| Fuzzy dedup | ❌ | ❌ | ✅ | ✅ |
| Multi-source merge | ❌ | ❌ | ❌ | ✅ |
| AI classification | ❌ | ❌ | ❌ | ✅ |
| Data quality report | ❌ | ❌ | ❌ | ✅ |
| Feishu Bitable output | ❌ | ❌ | ❌ | ✅ |
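The format standardization feature listed in the table can be pictured with a small sketch. It assumes that `1xx-xxxx-xxxx` denotes an 11-digit mainland China mobile number grouped 3-4-4 and that `YYYY-MM-DD` is the target date format; the helper names and the list of accepted input date formats are hypothetical and not taken from the skill's code.

```python
import re
from datetime import datetime


def normalize_phone(raw):
    """Return the number as 1xx-xxxx-xxxx, or None if it is not an 11-digit mobile number."""
    digits = re.sub(r"\D", "", raw)           # strip spaces, dashes, "+", etc.
    if len(digits) == 13 and digits.startswith("86"):
        digits = digits[2:]                   # drop an assumed +86 country code
    if len(digits) != 11 or not digits.startswith("1"):
        return None
    return f"{digits[:3]}-{digits[3:7]}-{digits[7:]}"


# Assumed input formats, all year-first to avoid day/month ambiguity.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d", "%Y%m%d")


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None
```

Returning `None` for unparseable values, rather than raising, leaves it to the caller to decide whether to fill, flag, or drop the cell.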

| Tier | Price | Monthly Rows | Sources |
|---|---|---|---|
| Free | ¥0 | 50 | 1 |
| Basic | ¥29/mo | 500 | 3 |
| Standard | ¥99/mo | 3,000 | Unlimited |
| Pro | ¥299/mo | Unlimited | Unlimited |
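The row and source quotas above could be enforced with a lookup like the following. This is only a sketch of the idea, not the contents of `tier_limits.py`: `TierLimit`, `TIER_LIMITS`, and `check_quota` are hypothetical names, and the tier keys follow the `free/basic/std/pro` values documented for `DATA_CLEANER_TIER`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimit:
    monthly_rows: Optional[int]  # None means unlimited
    sources: Optional[int]       # None means unlimited


# Values copied from the pricing table.
TIER_LIMITS = {
    "free": TierLimit(monthly_rows=50, sources=1),
    "basic": TierLimit(monthly_rows=500, sources=3),
    "std": TierLimit(monthly_rows=3000, sources=None),
    "pro": TierLimit(monthly_rows=None, sources=None),
}


def check_quota(tier, rows_used, rows_requested, source_count):
    """Return True if the request fits within the tier's monthly row and source limits."""
    limit = TIER_LIMITS[tier]
    if limit.sources is not None and source_count > limit.sources:
        return False
    if limit.monthly_rows is not None and rows_used + rows_requested > limit.monthly_rows:
        return False
    return True
```

For example, `check_quota("basic", 450, 100, 2)` is rejected because 550 rows would exceed the 500-row monthly allowance.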
data cleaning
deduplication
spreadsheet cleanup
CRM data cleanup
Excel cleaning
python scripts/main.py clean -i data.xlsx -o cleaned.xlsx
python scripts/main.py clean -t "name,phone\nJohn,13800138000" -f csv -o cleaned.csv
python scripts/main.py merge --sources customers.xlsx orders.csv --on phone -o merged.xlsx
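Conceptually, `merge --on phone` joins rows from several sources on a shared key. The sketch below shows one way such a join might work, using only the standard library and in-memory CSV text. It assumes left-join semantics, which the source does not confirm; `merge_on` and the sample data are hypothetical.

```python
import csv
import io


def merge_on(left_rows, right_rows, key):
    """Left-join two lists of dicts on `key`; left values win when column names collide."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    merged = []
    for left in left_rows:
        matches = index.get(left[key])
        if not matches:
            merged.append(dict(left))  # keep unmatched left rows as-is
            continue
        for right in matches:
            combined = dict(left)
            for column, value in right.items():
                combined.setdefault(column, value)  # add only columns the left row lacks
            merged.append(combined)
    return merged


# Hypothetical stand-ins for customers.xlsx and orders.csv.
customers_csv = "name,phone\nJohn,13800138000\nAlice,13700137000\n"
orders_csv = "phone,order_id\n13800138000,A001\n13800138000,A002\n"

customers = list(csv.DictReader(io.StringIO(customers_csv)))
orders = list(csv.DictReader(io.StringIO(orders_csv)))
merged = merge_on(customers, orders, "phone")
```

A customer with two orders yields two merged rows, while a customer with no orders is kept once without an `order_id` column.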
# Run from the scripts/ directory (or add it to sys.path) so that `main` is importable.
from main import run_clean_pipeline

result = run_clean_pipeline(
    sources=["orders.xlsx"],
    output_format="xlsx",
    output_path="/tmp/cleaned.xlsx",
    dedup_strategy="auto",
    fill_strategy="auto",
    classify=True,
    ai_model="deepseek",
    generate_report=True,
)

| Variable | Required | Description |
|---|---|---|
| DATA_CLEANER_API_KEY | For AI features | MiniMax or DeepSeek API key |
| DATA_CLEANER_TIER | Recommended | Subscription tier (free/basic/std/pro) |
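A caller might read these two variables as sketched below. The fallback to the `free` tier when `DATA_CLEANER_TIER` is unset is an assumption (the table only marks the variable as recommended), and `load_settings` is a hypothetical helper, not part of the skill.

```python
import os

VALID_TIERS = ("free", "basic", "std", "pro")


def load_settings(env=None):
    """Read tier and API key from the environment (or from a dict passed in for testing)."""
    env = os.environ if env is None else env

    # Assumption: an unset tier falls back to "free".
    tier = env.get("DATA_CLEANER_TIER", "free").strip().lower()
    if tier not in VALID_TIERS:
        raise ValueError(f"DATA_CLEANER_TIER must be one of {VALID_TIERS}, got {tier!r}")

    # The key is only needed for AI features, so a missing key is not an error here.
    api_key = env.get("DATA_CLEANER_API_KEY") or None
    return {"tier": tier, "api_key": api_key, "ai_enabled": api_key is not None}
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.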

multi-source-data-cleaner/
├── SKILL.md
├── README.md
├── scripts/
│   ├── main.py              # Entry: run_clean_pipeline / run_merge_pipeline
│   ├── parser.py            # F1: Multi-format parsing
│   ├── field_identifier.py  # F2: AI field identification
│   ├── cleaner.py           # F3: Cleaning engine
│   ├── classifier.py        # F4: Classification / tagging
│   ├── merger.py            # F5: Multi-source join
│   ├── reporter.py          # F6: Quality report generation
│   ├── output.py            # F6: Output (Excel/CSV/Bitable/Feishu Doc)
│   └── tier_limits.py       # Tier access control + API key verification
└── tests/
    ├── test_parser.py
    ├── test_cleaner.py
    └── test_field_identifier.py
MIT
For paid plans, visit YK-Global.com