Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Data Cleaner AI

v1.0.3

Cleans and deduplicates multi-format data with AI field detection, format standardization, multi-source merging, and outputs Excel, CSV, or Feishu Bitable.

0 · 73 · 0 current · 0 all-time
by YK-Global @billjamno58

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for billjamno58/data-cleaner-ai.

Prompt preview (Install & Setup):
Install the skill "Data Cleaner AI" (billjamno58/data-cleaner-ai) from ClawHub.
Skill page: https://clawhub.ai/billjamno58/data-cleaner-ai
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Canonical install target

openclaw skills install billjamno58/data-cleaner-ai

ClawHub CLI


npx clawhub@latest install data-cleaner-ai
Security Scan
Capability signals
Crypto · Can make purchases · Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability (flagged)
The skill advertises Feishu Bitable write-back and AI-driven features, which plausibly require external credentials, but the registry metadata at the top lists "Required env vars: none" and "Primary credential: none" — inconsistent with SKILL.md and the included code. The SKILL.md lists OPENAI_API_KEY, SKILL_BILLING_API_KEY and FEISHU_USER_ID; the code implements billing (skillpay.me) and AI field identification/classification, so those environment variables make sense — but Feishu write-back would normally require Feishu API credentials (app token/client secret) which are not declared. This mismatch between claimed metadata and actual requirements is a red flag.
Instruction Scope (flagged)
Runtime invokes scripts/main.py which orchestrates parsing, cleaning, AI classification, billing, reporting and Feishu output. The SKILL.md and code indicate that data will be sent to AI APIs (OPENAI_API_BASE / OPENAI_API_KEY) for field identification/classification but the doc does not explicitly describe what parts of the user's dataset are transmitted or how sensitive fields are handled. Billing transmits the FEISHU_USER_ID to skillpay.me. The instructions grant broad discretion (AI auto-detection, semantic inference) that could cause sensitive PII to be sent to external AI endpoints; this is not clearly documented or consented in the SKILL.md.
Install Mechanism
There is no external install spec or downloads; the skill runs local Python scripts bundled in the package. No evidence of fetching code from untrusted URLs or executing remote installers. That reduces supply-chain risk compared to arbitrary downloads, but the included code will execute locally and may make outbound network calls.
Credentials (flagged)
SKILL.md requests multiple sensitive environment variables: OPENAI_API_KEY / OPENAI_API_BASE (AI inference), SKILL_BILLING_API_KEY (SkillPay billing), SKILL_BILLING_SKILL_ID and FEISHU_USER_ID. These map to the skill's features, but none of them is listed in the registry metadata, which is inconsistent. Additionally, Feishu write-back would normally require Feishu API credentials (app tokens / access tokens), but no such variables are declared; either the feature is incomplete or it expects implicit platform-level Feishu credentials that are not disclosed. billing.py also implements a "dev mode" in which a missing billing API key or a network error causes the code to "fail open" and allow operation without charge; that behavior should be made explicit to users because it affects billing correctness and abuse risk.
Persistence & Privilege
The skill does not request always:true and does not claim to modify other skills or global agent settings. It runs as a subprocess (scripts/main.py). Autonomous invocation is allowed (platform default) but not combined with excessive declared privileges here; that's acceptable but be aware an autonomously-invoked skill can still exfiltrate data if given API keys.
What to consider before installing
This skill includes runnable Python code that will call external services: a billing endpoint (skillpay.me) and AI APIs (OPENAI_API_BASE) for field detection/classification. Before installing or providing credentials:
- Verify the publisher and source: registry metadata shows "Source: unknown" and no homepage. Prefer packages with a known maintainer and repository.
- Ask the author to explain the credential gap: the registry shows no required env vars but SKILL.md requires OPENAI_API_KEY, SKILL_BILLING_API_KEY and FEISHU_USER_ID. Also ask how Feishu write-back authenticates and which Feishu credentials are needed.
- Review scripts/main.py (entry point) to confirm exactly what is sent to external AI endpoints. The package likely sends dataset samples/columns to the AI service; do not run it on datasets with sensitive PII unless you accept that data will be transmitted to the configured AI provider.
- Be cautious with SKILL_BILLING_API_KEY: if you provide a builder billing key, ensure you understand who controls billing and whether that key is a secret of the publisher or of your environment. Note the billing module will "fail open" on network errors or a missing key and allow execution without charging; that may be intended for development but should be explicit.
- If you need the Feishu output feature, require the author to document required Feishu credentials and an explicit privacy statement for what is posted to Feishu Bitable/Docs.
- If you lack the ability to audit network calls, run the skill in a sandboxed environment and monitor outbound traffic (to detect unexpected endpoints) or request an official repository with a clear maintainer and security/privacy documentation.
Given the inconsistencies and external network activity, proceed only after clarifications or a code-level review; treat the skill as suspicious until you can confirm the data flows and credentials required.

Like a lobster shell, security has layers — review code before you run it.

latest vk97frnrgtw721fmga1924xfzzs85eyza
73 downloads
0 stars
4 versions
Updated 3d ago
v1.0.3
MIT-0

Data Cleaner AI

Upload messy data — get clean, structured output. Supports multi-format parsing, AI field identification, intelligent dedup/fill/formatting, multi-source join, and Feishu-native output (Bitable + quality report doc).

Use cases: E-commerce order cleanup, CRM customer data cleansing, bank statement reconciliation, roster cleanup, multi-system data merge.


Capabilities

F1 · Multi-Format Parsing

  • Excel (.xlsx / .xls)
  • CSV / TSV
  • JSON (semi-structured)
  • Clipboard paste text

F2 · Smart Field Identification

  • AI auto-detects: name, phone, email, address, amount, date, SKU, order ID, ID number, gender, etc.
  • Supports user-defined field mapping override

F3 · Data Cleaning

  • Deduplication: Exact match + fuzzy dedup (FuzzyWuzzy, threshold 88%)
  • Missing value fill: Mean / mode / semantic inference / leave blank
  • Format standardization:
    • Phone → 1xx-xxxx-xxxx
    • Date → YYYY-MM-DD
    • Amount → 2 decimal places
    • Address → Province/City/District/Street standardization
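
The format targets listed above can be illustrated with a minimal standard-library sketch. It is not the skill's cleaner.py: the function names, the accepted date layouts, and the handling of a leading 86 country code are all assumptions made for illustration.

```python
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP

# Assumption: only these year-first input layouts are recognized.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d", "%Y%m%d")


def normalize_phone(raw: str) -> str:
    """Format an 11-digit mobile number as 1xx-xxxx-xxxx; otherwise return raw."""
    digits = re.sub(r"\D", "", raw)
    # Assumption: a 13-digit value starting with 86 carries a country code.
    if len(digits) == 13 and digits.startswith("86"):
        digits = digits[2:]
    if len(digits) == 11 and digits.startswith("1"):
        return f"{digits[:3]}-{digits[3:7]}-{digits[7:]}"
    return raw


def normalize_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, or raw if no known layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return raw


def normalize_amount(raw: str) -> str:
    """Strip currency symbols and separators, then round to 2 decimal places."""
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    try:
        amount = Decimal(cleaned).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    except InvalidOperation:
        return raw
    return str(amount)


print(normalize_phone("+86 138 0013 8000"))  # 138-0013-8000
print(normalize_date("2024/3/5"))            # 2024-03-05
print(normalize_amount("¥1,234.5"))          # 1234.50
```

Each helper returns its input unchanged when it cannot parse it, so unrecognized values stay visible for review instead of being silently rewritten.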

F4 · Data Classification / Tagging (PRO)

  • 8 built-in business rules (high-value customer, dormant user, VIP, enterprise, etc.)
  • Supports custom JSON rules
  • AI auto-tagging (requires PRO + AI API Key)
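
SKILL.md does not document the schema for custom JSON rules, so the sketch below uses a purely hypothetical one (`tag`, `field`, `op`, `value`) to show how rule-based tagging of this kind could work. The field names and thresholds are invented for illustration and are not the skill's built-in rules.

```python
import json
import operator

# Hypothetical comparison operators a rule may reference.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

# Hypothetical rule file content; thresholds are made up for the example.
RULES_JSON = """
[
  {"tag": "high-value customer", "field": "total_spend", "op": ">=", "value": 10000},
  {"tag": "dormant user", "field": "days_since_last_order", "op": ">", "value": 180}
]
"""


def apply_rules(row: dict, rules: list) -> list:
    """Return the tags of every rule whose condition holds for this row."""
    tags = []
    for rule in rules:
        value = row.get(rule["field"])
        # Rows that lack the field are skipped rather than tagged.
        if value is not None and OPS[rule["op"]](value, rule["value"]):
            tags.append(rule["tag"])
    return tags


rules = json.loads(RULES_JSON)
print(apply_rules({"total_spend": 12000, "days_since_last_order": 30}, rules))
# ['high-value customer']
```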

F5 · Multi-Source Join / Merge (PRO)

  • Cross-file relational join on key fields
  • Fuzzy join when exact key not available (FuzzyWuzzy)
  • Conflicting-field resolution: priority by source order or latest timestamp
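
A minimal sketch of the join behavior described above: exact match on the key first, a fuzzy match as fallback, and conflicts resolved by source order. It uses the standard library's `difflib` as a stand-in for FuzzyWuzzy, and it borrows the 88 threshold from the dedup setting; the skill's actual join threshold is not documented, and `fuzzy_join` is a hypothetical name.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (difflib stand-in for a FuzzyWuzzy ratio)."""
    return 100 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_join(left: list, right: list, on: str, threshold: float = 88.0) -> list:
    """Left-join two lists of dict rows on the `on` field.

    Exact key matches are preferred; otherwise the most similar right-hand
    row is used if it scores at or above `threshold`. On conflicting fields
    the left source wins (priority by source order).
    """
    merged = []
    for lrow in left:
        match = next((r for r in right if r[on] == lrow[on]), None)
        if match is None and right:
            best = max(right, key=lambda r: similarity(r[on], lrow[on]))
            if similarity(best[on], lrow[on]) >= threshold:
                match = best
        row = dict(match) if match else {}
        row.update(lrow)  # left-hand values overwrite right-hand values
        merged.append(row)
    return merged


customers = [{"name": "Acme Corp", "phone": "138"}]
orders = [{"name": "Acme Corp.", "phone": "139", "city": "Shanghai"}]
print(fuzzy_join(customers, orders, on="name"))
```

Here "Acme Corp" and "Acme Corp." have no exact match but score above the threshold, so the rows are merged and the customer's `phone` value is kept because that source comes first.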

F6 · Feishu Native Output

  • Excel / CSV export
  • Feishu Bitable (multi-dimensional table) write-back
  • Data quality report auto-generated as Feishu Doc (Markdown)

Tier Feature Matrix

Feature | FREE | PRO
Multi-format parsing
Basic dedup
Smart fill
Format standardization
Fuzzy dedup
Multi-source merge
AI classification
Data quality report
Feishu Bitable output

Pricing

Per-call billing (no monthly fee):

Tier | Price per Call
FREE | $0.00 USDT
PRO | $0.01 USDT

Each cleaning pipeline execution (clean or merge) = one billable call.
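
Because every pipeline execution counts as one call, the expected spend is simply the call count times the tier price. A small sketch using `Decimal` to avoid float rounding; `estimated_cost` is a hypothetical helper, not part of the skill.

```python
from decimal import Decimal

# Prices per call in USDT, taken from the pricing table above.
PRICE_PER_CALL = {"FREE": Decimal("0.00"), "PRO": Decimal("0.01")}


def estimated_cost(tier: str, calls: int) -> Decimal:
    """Return the total USDT charge for `calls` clean or merge executions."""
    return PRICE_PER_CALL[tier] * calls


print(estimated_cost("PRO", 250))  # 2.50
```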


Usage

Feishu Trigger

data cleaning
deduplication
spreadsheet cleanup
CRM data cleanup
Excel cleaning

CLI

python scripts/main.py clean -i data.xlsx -o cleaned.xlsx
python scripts/main.py clean -t "name,phone\nJohn,13800138000" -f csv -o cleaned.csv
python scripts/main.py merge --sources customers.xlsx orders.csv --on phone -o merged.xlsx

Python API

from main import run_clean_pipeline

result = run_clean_pipeline(
    sources=["orders.xlsx"],
    output_format="xlsx",
    output_path="/tmp/cleaned.xlsx",
    dedup_strategy="auto",
    fill_strategy="auto",
    classify=True,
    ai_model="deepseek",
    generate_report=True,
)

Directory Structure

data-cleaner-ai/
├── SKILL.md
└── scripts/
    ├── main.py              # Entry: run_clean_pipeline / run_merge_pipeline
    ├── parser.py            # F1: Multi-format parsing
    ├── field_identifier.py  # F2: AI field identification
    ├── cleaner.py           # F3: Cleaning engine
    ├── classifier.py        # F4: Classification / tagging
    ├── merger.py            # F5: Multi-source join
    ├── reporter.py          # F6: Quality report generation
    ├── output.py            # F6: Output (Excel/CSV/Bitable/Feishu Doc)
    ├── tier_limits.py       # Tier access control
    └── billing.py           # SkillPay billing integration

Billing

This skill uses SkillPay (skillpay.me) for per-call billing.

Fee: $0.0100 USDT per execution (all paid tiers)
External API: https://skillpay.me/api/v1/billing
Data transmitted: User identifier (FEISHU_USER_ID environment variable)

Billing occurs at the start of each cleaning or merge execution. If balance is insufficient, the tool returns a payment_url where the user can recharge.


Required Environment Variables

Variable | Description
FEISHU_USER_ID | Feishu user open_id for billing identification
OPENAI_API_KEY | AI model API key (OpenAI, MiniMax, or OpenAI-compatible endpoint)
OPENAI_API_BASE | Base URL for AI API (optional, defaults to MiniMax endpoint)
SKILL_BILLING_API_KEY | Builder API Key from skillpay.me (required for paid calls)
SKILL_BILLING_SKILL_ID | Skill slug on SkillPay (defaults to data-cleaner-ai)
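
Before running the skill, it can help to check which of these variables are set. The sketch below assumes that the three variables without a documented default must be present and that the other two are optional; that split is an interpretation of the table, and `missing_env` is a hypothetical helper. It reports names only and never prints secret values.

```python
import os

# Assumption: variables with no documented default must be set explicitly.
REQUIRED = ("FEISHU_USER_ID", "OPENAI_API_KEY", "SKILL_BILLING_API_KEY")
# Documented as optional or as having a default.
OPTIONAL = ("OPENAI_API_BASE", "SKILL_BILLING_SKILL_ID")


def missing_env(environ=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_env()
if missing:
    print("Missing required variables:", ", ".join(missing))
for name in OPTIONAL:
    # Report presence only, never the value.
    print(name, "is set" if os.environ.get(name) else "is not set (default applies)")
```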

Error Handling

Error | Handling
Balance insufficient | Return payment_url for recharge
Network error on billing | Allow call through in dev mode (no charge)
Tier feature not available | Skip feature gracefully, continue with available features
No data source provided | Raise error requesting input
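
The billing rows above, together with the scan's note that a missing billing key also lets the call through, amount to a "fail open" gate. The sketch below illustrates that control flow only. It is not the skill's billing.py: `send_charge` is a hypothetical callable standing in for the HTTP request, and the `ok` / `payment_url` reply shape is an assumption, since the SkillPay response format is not documented here.

```python
class InsufficientBalance(Exception):
    """Raised when the billing service refuses the charge."""

    def __init__(self, payment_url):
        super().__init__("balance insufficient")
        self.payment_url = payment_url


def billing_gate(send_charge, api_key=None) -> str:
    """Decide whether a pipeline run may proceed.

    `send_charge(api_key)` is a hypothetical callable that performs the
    billing request and returns a dict such as {"ok": True} or
    {"ok": False, "payment_url": "..."}; it raises OSError on network failure.
    """
    if not api_key:
        # Fail open: no billing key configured, run without charging.
        return "not charged (dev mode: no key)"
    try:
        reply = send_charge(api_key)
    except OSError:
        # Fail open: billing endpoint unreachable, run without charging.
        return "not charged (dev mode: network error)"
    if not reply.get("ok"):
        raise InsufficientBalance(reply.get("payment_url"))
    return "charged"


print(billing_gate(lambda key: {"ok": True}, api_key="example-key"))  # charged
print(billing_gate(lambda key: {"ok": True}))  # not charged (dev mode: no key)
```

Failing open keeps the tool usable during development, but it also means that blocking the billing endpoint is enough to run uncharged, which is the abuse risk the security scan points out.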

License

MIT
