Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

universal-data-analyst-en

v1.0.3

Performs automated, LLM-driven data analysis including loading, validation, method selection, script generation, execution, and comprehensive reporting for d...

0 stars · 145 downloads · 0 current · 0 all-time
by yamaz@yamaz49

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for yamaz49/universal-data-analyst-en.

Prompt Preview: Install & Setup
Install the skill "universal-data-analyst-en" (yamaz49/universal-data-analyst-en) from ClawHub.
Skill page: https://clawhub.ai/yamaz49/universal-data-analyst-en
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install universal-data-analyst-en

ClawHub CLI


npx clawhub@latest install universal-data-analyst-en

Security Scan
Capability signals
Crypto: Can make purchases
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal
Suspicious
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
The name and description match the included modules (data loader, validator, orchestrator, LLM prompt generator, report generator). The code produces prompts for LLMs and coordinates a multi-step analysis pipeline, which is coherent with the stated purpose. Minor mismatch: the README/SKILL.md includes example calls to an LLM client (Anthropic/Claude), but the shipped LLM module returns prompts rather than performing network calls, and the skill declares no required env vars/credentials for LLM access.
Instruction Scope
The orchestrator generates full Python analysis scripts (via LLM prompts) and then executes them (the orchestrator imports subprocess and contains step execution logic). Executing code generated by an LLM on the user's machine is expected for this tool's purpose, but it is a high-risk action: generated scripts can contain arbitrary file I/O, shell/OS calls, or network operations, and thus may exfiltrate data or modify the system. The SKILL.md and code instruct saving prompt files and calling an LLM externally, but the skill also supports an autonomous flow that can generate and run code. The provided code enforces no sandboxing or other restrictions.
Install Mechanism
There is no install spec (instruction-only skill with packaged Python modules). Nothing is downloaded at install time, so no arbitrary remote code is pulled during installation. The runtime will write output and prompt files to local output directories.
Credentials
The skill declares no required environment variables or credentials. However, documentation/examples reference calling external LLM APIs (Anthropic/Claude) which would require API keys if you choose to integrate — these keys are not managed by the skill. The shipped code itself does not appear to read unrelated system credentials or config paths.
Persistence & Privilege
The skill sets always:false and makes no special persistence changes or modifications to other skills/configs. It creates session/output directories within the working directory and does not request or claim system-wide privileges. Autonomous invocation is allowed by platform default, which, combined with script execution, increases the blast radius, but is not itself an unusual setting.
What to consider before installing
This skill is coherent with its stated purpose, but it generates Python analysis scripts via LLM prompts and can execute those scripts locally. Before installing or running:

  1. Do NOT run this on sensitive or production systems without reviewing generated scripts first.
  2. Inspect any generated analysis_script.py for network, subprocess, or filesystem operations (look for imports like requests, socket, subprocess, os.system, eval/exec, urllib, ftplib, paramiko); a minimal review sketch follows below.
  3. Prefer running the orchestration and script execution inside an isolated environment (ephemeral VM, container, or sandbox) with limited network and file access.
  4. If you will call external LLMs, keep API keys separate and use only trusted endpoints; the skill does not manage credentials.
  5. Consider using the human-in-the-loop mode (generate prompts and scripts, but manually review and execute) rather than fully autonomous execution.

If you want me to, I can: (a) scan the full repository for occurrences of subprocess/requests/os.system/eval/exec/network endpoints, or (b) point to specific lines/functions to review before running.
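
To make item 2 concrete, here is a minimal, hedged sketch of such a manual review step in Python. The helper name (flag_risky_lines), the pattern list, and the script path are illustrative assumptions, not part of the skill.

# Illustrative helper (not part of the skill): flag risky operations in a
# generated analysis script before deciding whether to run it.
import re
from pathlib import Path

# Patterns mirror the review advice above; extend as needed (hypothetical list).
RISKY_PATTERNS = [
    r"\b(import|from)\s+(requests|socket|subprocess|urllib|ftplib|paramiko)\b",
    r"\bos\.system\s*\(",
    r"\b(eval|exec)\s*\(",
]

def flag_risky_lines(script_path: str) -> list[tuple[int, str]]:
    """Return (line number, line text) pairs that match a risky pattern."""
    hits = []
    for lineno, line in enumerate(Path(script_path).read_text().splitlines(), start=1):
        if any(re.search(pattern, line) for pattern in RISKY_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits

# Review hits manually before executing anything the LLM generated.
for lineno, line in flag_risky_lines("analysis_script.py"):
    print(f"line {lineno}: {line}")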

Like a lobster shell, security has layers — review code before you run it.

latest: vk977e8k81ad36c0kgs94265tj9848544
145 downloads · 0 stars · 4 versions · Updated 3w ago
v1.0.3
MIT-0

Universal Data Analyst (通用数据分析专家)

Introduction

An intelligent data analysis skill based on Data Ontology. Unlike keyword-based approaches, this skill uses LLM reasoning for every analysis, automatically identifying data types, selecting analysis methods, generating scripts, and outputting reports.

Supports both economic data (retail, subscription, finance, etc.) and non-economic data (scientific measurements, social networks, text, etc.), handling multiple formats including CSV, Excel, Parquet, JSON, and more.


How to Trigger

Simply upload a data file or send any of these types of messages:

  • "Help me analyze this data"
  • "What patterns are in this CSV?"
  • "Explore this dataset"
  • "Check the data quality for me"
  • Directly upload .csv / .xlsx / .parquet / .json files

Core Design: Four-Layer Analysis Framework

Layer 1: Data Ontology
        ↓  What kind of existence is this? Entity type? Generation mechanism?
Layer 2: Problem Typology
        ↓  Descriptive / Diagnostic / Predictive / Prescriptive / Causal?
Layer 3: Methodology Mapping
        ↓  Match domain-recognized analysis frameworks
Layer 4: Validation & Output
           Data quality report + Analysis scripts + HTML/MD reports

Each layer invokes LLM reasoning without any hardcoded rules.
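
As a rough illustration only (not the skill's actual code), a Layer 1 prompt might be assembled from a lightweight dataframe summary like this; the function name, prompt wording, and input file are assumptions:

import pandas as pd

def build_ontology_prompt(df: pd.DataFrame) -> str:
    # Summarize the data so the LLM can reason about what each row represents.
    summary = {
        "rows": len(df),
        "columns": {col: str(dtype) for col, dtype in df.dtypes.items()},
        "sample": df.head(3).to_dict(orient="records"),
    }
    return (
        "You are a data ontology expert.\n"
        f"Dataset summary: {summary}\n"
        "Answer: (1) What entity does each row represent? "
        "(2) What mechanism generated this data (transactions, measurements, events, ...)? "
        "(3) Which domain does it most likely belong to?"
    )

df = pd.read_csv("orders.csv")  # hypothetical input
with open("step2_ontology_prompt.txt", "w") as f:
    f.write(build_ontology_prompt(df))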


Analysis Workflow (7 Steps)

Step | Content | Description
1 | Data Loading | Auto-recognize formats, support multiple file types
2 | Ontology Recognition | LLM judges entity type and generation mechanism
3 | Quality Validation | Auto-detect missing values, outliers, duplicates; output quality score
4 | Plan Generation | LLM selects analysis framework and path based on user intent
5 | Script Generation | LLM generates executable Python analysis scripts
6 | Execute Analysis | Run scripts, generate charts and numerical results
7 | Comprehensive Report | Output HTML + Markdown dual-format reports

Flow Health Monitoring (NEW)

Each step has status tracking and error handling:

  • Step Dependency Check - Automatically prevents subsequent steps when prerequisites fail (a minimal sketch follows the example below)
  • Clear Error Messages - Provides explicit failure reasons and fix suggestions
  • Flow Health Report - Outputs complete execution status and issue summary

If a step fails, you'll see:

⚠️ Flow Interrupted!
   Reason: Critical step 'Data Loading' failed: Encoding error

Fix Suggestions:
  1. File encoding may not be UTF-8, try manually specifying encoding parameter
  2. Common Chinese encodings: gbk, gb2312, gb18030
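
A minimal sketch of this dependency-check idea (not the skill's actual orchestrator; step names and statuses are assumptions):

# Later steps refuse to run when a critical prerequisite has not succeeded.
STEP_DEPENDENCIES = {
    "ontology_recognition": ["data_loading"],
    "quality_validation": ["data_loading"],
    "plan_generation": ["ontology_recognition", "quality_validation"],
    "script_generation": ["plan_generation"],
    "execute_analysis": ["script_generation"],
    "report": ["execute_analysis"],
}
step_status = {}  # step name -> "ok" | "failed" | "skipped"

def run_step(name, func):
    missing = [dep for dep in STEP_DEPENDENCIES.get(name, []) if step_status.get(dep) != "ok"]
    if missing:
        step_status[name] = "skipped"
        print(f"⚠️ Flow interrupted! Step '{name}' skipped; failed prerequisites: {missing}")
        return
    try:
        func()
        step_status[name] = "ok"
    except Exception as exc:
        step_status[name] = "failed"
        print(f"⚠️ Critical step '{name}' failed: {exc}")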

Supported Data Types

Economic Data

Data Characteristics | Recognized As | Auto-matched Framework
Orders + Price + SKU | Retail Economy | Value Chain / ABC-XYZ / RFM
User + Subscription Cycle + Churn | Subscription Economy | LTV / Cohort / Retention Curves
Click / Add-to-cart / Purchase Events | Attention Economy | Funnel Analysis / AARRR
GMV + Platform Matching | Commission Economy | Two-sided Network Effects / Unit Economics
Position + Skills + Salary | Labor Market | Skill Premium / Experience Elasticity
OHLCV Price Data | Financial Time Series | Technical Analysis / Volatility Models

Non-Economic Data

Data Type | Auto-matched Framework
Sensors / Continuous Time Series | Time Series Decomposition, Extreme Value Analysis
Social / Network Relationship | Centrality Analysis, Community Detection
Geographic / Spatial | Spatial Autocorrelation, Hotspot Analysis
Text Corpus | Topic Modeling, Sentiment Analysis
Biomedical | Survival Analysis, Differential Expression

Supported File Formats

  • CSV / TSV (.csv, .tsv, .txt) - Auto encoding detection, supports utf-8, gbk, latin1, etc.
  • Excel (.xlsx, .xls)
  • Parquet (.parquet, .pq)
  • JSON (.json)
  • SQL Database (via connection string)

Encoding Fault Tolerance

CSV loading automatically tries multiple encodings (see the sketch after this list):

  • Auto encoding detection (if chardet library available)
  • Fallback encodings: utf-8, utf-8-sig, gbk, gb2312, gb18030, latin1, etc.
  • Engine fallback: Auto-switches to Python engine when C engine fails, skipping corrupted rows
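
A rough sketch of this fallback behavior, assuming pandas and the encoding order shown above (the skill's exact logic may differ):

import pandas as pd

FALLBACK_ENCODINGS = ["utf-8", "utf-8-sig", "gbk", "gb2312", "gb18030", "latin1"]

def read_csv_tolerant(path):
    encodings = list(FALLBACK_ENCODINGS)
    try:
        import chardet  # optional dependency: put the detected encoding first
        with open(path, "rb") as f:
            detected = chardet.detect(f.read(100_000)).get("encoding")
        if detected:
            encodings.insert(0, detected)
    except ImportError:
        pass
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except (UnicodeDecodeError, pd.errors.ParserError):
            continue
    # Last resort: the slower Python engine, skipping rows it cannot parse.
    return pd.read_csv(path, encoding="latin1", engine="python", on_bad_lines="skip")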

Output Contents

Each analysis generates:

session_YYYYMMDD_HHMMSS/
├── step2_ontology_prompt.txt     # Ontology recognition prompts (reusable)
├── step3_validation_report.json  # Data quality report
├── step3_cleaning_report.txt     # Data cleaning recommendations
├── step4_planning_prompt.txt     # Analysis planning prompts (reusable)
├── step5_script_prompt.txt       # Script generation prompts (reusable)
├── analysis_report.html          # Comprehensive HTML report (with charts)
├── analysis_report.md            # Markdown report
└── charts/                       # All analysis charts (PNG)
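
A small sketch (assumed, not the skill's code) of how a timestamped session directory matching this layout could be created:

from datetime import datetime
from pathlib import Path

# e.g. session_20240101_120000/charts/
session_dir = Path(f"session_{datetime.now():%Y%m%d_%H%M%S}")
(session_dir / "charts").mkdir(parents=True, exist_ok=True)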

Usage Examples

Example 1: Analyzing E-commerce Sales Data

User: Help me analyze this sales data. I want to know which products sell well and which customers are high-value.

[Upload orders.csv]

Skill automatically:

  1. Recognizes as "Retail Economy × Transaction/Event Data"
  2. Selects RFM Customer Value Analysis + ABC Product Classification framework (an RFM sketch follows this list)
  3. Generates and executes analysis scripts
  4. Outputs customer segmentation distribution, product sales ranking, RFM heatmap, and HTML report
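
For illustration, a hedged RFM scoring sketch in pandas; the column names (customer_id, order_date, amount) and quintile scoring are assumptions about orders.csv, not the skill's generated script:

import pandas as pd

orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
now = orders["order_date"].max()

# Recency / Frequency / Monetary per customer
rfm = orders.groupby("customer_id").agg(
    recency=("order_date", lambda s: (now - s.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)

# Quintile scores, 5 = best; rank() avoids qcut failures on duplicate edges.
rfm["R"] = pd.qcut(rfm["recency"].rank(method="first"), 5, labels=[5, 4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5]).astype(int)
rfm["segment"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)

print(rfm.sort_values("monetary", ascending=False).head())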

Example 2: Analyzing User Behavior Logs

User: This is our app's user behavior log. I want to see the user conversion funnel.

[Upload events.csv]

Skill automatically:

  1. Recognizes as "Attention/Conversion Economy × Event Sequence Data"
  2. Selects Funnel Analysis + Session Sequence Mining framework (a funnel sketch follows this list)
  3. Outputs conversion rates at each step, churn node analysis, user path Sankey diagram
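
A simple funnel computation sketch; the event names and columns (user_id, event) are assumptions about events.csv:

import pandas as pd

events = pd.read_csv("events.csv")
funnel_steps = ["click", "add_to_cart", "purchase"]  # assumed event order

# Users who reached each step, counted as the cumulative intersection of
# users seen at every earlier step (ordering within sessions is ignored here).
reached = []
current = set(events.loc[events["event"] == funnel_steps[0], "user_id"])
for step in funnel_steps:
    current &= set(events.loc[events["event"] == step, "user_id"])
    reached.append(len(current))

for i, (step, n) in enumerate(zip(funnel_steps, reached)):
    rate = n / reached[i - 1] if i and reached[i - 1] else 1.0
    print(f"{step}: {n} users ({rate:.1%} of previous step)")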

Example 3: Analyzing Meteorological Observation Data

User: Help me analyze this weather station's observation records; I want to understand temperature and precipitation patterns.

[Upload weather.csv]

Skill automatically:

  1. Recognizes as "Earth Science × Time Series/Trajectory Data × Instrument Measurement"
  2. Selects Time Series Decomposition + Seasonality Analysis + Extreme Value Statistics framework (a decomposition sketch follows this list)
  3. Outputs trend charts, seasonal decomposition charts, outlier reports
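
A rough decomposition sketch using statsmodels (not in the skill's declared dependencies); column names (date, temperature) are assumptions about weather.csv:

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

weather = pd.read_csv("weather.csv", parse_dates=["date"]).set_index("date")
daily_temp = weather["temperature"].resample("D").mean().interpolate()

# Trend / seasonal / residual panels (yearly seasonality on daily data)
result = seasonal_decompose(daily_temp, model="additive", period=365)
result.plot()

# Simple outlier flag: residuals beyond 3 standard deviations
resid = result.resid.dropna()
outliers = resid[(resid - resid.mean()).abs() > 3 * resid.std()]
print(outliers.head())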

Dependencies

pandas >= 1.3
numpy >= 1.21
matplotlib >= 3.4
seaborn >= 0.11
scipy >= 1.7
openpyxl >= 3.0   # Excel support
chardet >= 4.0    # Auto encoding detection (optional but recommended)
pyarrow >= 6.0    # Parquet support (optional)
sqlalchemy >= 1.4 # SQL support (optional)

Version

v1.1.0 · Author: Claude · License: CC BY-NC-SA 4.0

v1.1.0 Updates (2026-03-23)

  1. Flow Health Monitoring - Added step status tracking, dependency checks, error messages
  2. Enhanced Encoding Fault Tolerance - Auto-try multiple encodings for CSV/TSV (utf-8, gbk, latin1, etc.)
  3. Engine Fallback - Auto-switches to Python engine when C engine fails, skipping corrupted rows

v1.0.0

  • Initial version: Four-layer analysis framework + 7-step analysis workflow
