Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Data Cog

AI data analysis and visualization powered by CellCog. Upload CSVs and get charts, dashboards, statistical reports, and clean data back. Data cleaning, explo...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
4 · 1.6k · 7 current installs · 7 all-time installs
by CellCog @nitishgargiitd
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name/description (data analysis & visualization) align with the runtime instructions: prompts that reference <SHOW_FILE>/path/to/data.csv and examples using client.create_chat to run analyses. The skill explicitly depends on the 'cellcog' skill for SDK setup/API calls, which is coherent for a remote analysis service. Note: the real authentication and install behavior are delegated to the cellcog skill, which is not provided here.
! Instruction Scope
The SKILL.md instructs agents/users to upload complete CSVs (via <SHOW_FILE> tags) and promises 'Full Python access' and that 'CellCog runs the code for you.' It's unclear whether code executes server-side (CellCog) or client-side/local Python, and there is no disclosure about data transmission, storage, or retention. Uploading entire datasets (potentially sensitive) to an external service is central to the skill but not explicitly qualified or limited in the instructions.
Install Mechanism
This skill has no install spec itself, which reduces direct install risk. However, it tells the user/agent to run 'clawhub install cellcog' and to 'Read the cellcog skill first' — the security posture therefore depends on that other skill's install steps and any binaries it pulls in. Because that dependency isn't inspected here, you should review the cellcog skill/install before proceeding.
! Credentials
This skill declares no required environment variables or credentials, but it relies on the separate cellcog skill to handle SDK setup/auth. The SKILL.md does not explain what credentials are needed, where data is sent, or how it will be protected. Requesting full datasets for server-side analysis is proportionate to data-analysis functionality but raises privacy/exfiltration concerns unless the dependent skill's auth and retention policies are transparent.
Persistence & Privilege
The skill does not request 'always: true' or other elevated platform privileges; it is user-invocable and allows normal autonomous invocation. It does not declare config paths or persistent presence beyond depending on the cellcog skill.
What to consider before installing
This skill appears to be a front-end for the CellCog service and will send uploaded CSVs to that service for processing. Before installing or using it: (1) inspect the referenced 'cellcog' skill and its install steps to see what code/binaries and environment variables it requires; (2) confirm whether analysis runs on CellCog servers or locally and review CellCog's privacy/retention policy and data residency; (3) avoid uploading sensitive or regulated data until you understand authentication, storage, and sharing; (4) if you must test, start with synthetic or anonymized datasets; (5) ask the publisher to clarify what 'Full Python access' means (server-side sandbox vs. executing code on your machine) and whether uploaded files are retained or shared.
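Point (4) above, testing with synthetic data first, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `make_synthetic_orders` and the column names are hypothetical and not part of the skill.

```python
import csv
import io
import random

def make_synthetic_orders(n_rows, seed=0):
    """Return CSV text with made-up order rows; no real data is involved."""
    rng = random.Random(seed)  # seeded so the same file can be regenerated
    regions = ["north", "south", "east", "west"]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["order_id", "region", "revenue"])
    for i in range(1, n_rows + 1):
        writer.writerow([i, rng.choice(regions), round(rng.uniform(5, 500), 2)])
    return buf.getvalue()

csv_text = make_synthetic_orders(5)
print(csv_text)
```

Writing `csv_text` to a file gives a throwaway dataset to upload while the data-handling questions above remain open.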

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.2
latest: vk97ehpdw42v8yqnbn6vbn3bsgs83p0sd

Runtime requirements

🔢 Clawdis
OS: macOS · Linux · Windows

SKILL.md

Data Cog - Your Data Has Answers, CellCog Finds Them

Your data has answers. CellCog asks the right questions. #1 on DeepResearch Bench (Feb 2026) + frontier coding agent.

Most AI tools return code when you ask about data. CellCog returns answers — actual charts, clean datasets, statistical reports, and visual dashboards. Upload messy CSVs with a minimal prompt, and CellCog's coding agent explores your data, finds the patterns, and presents them beautifully. Full Python access for everything from data cleaning to ML model evaluation.


Prerequisites

This skill requires the cellcog skill for SDK setup and API calls.

clawhub install cellcog

Read the cellcog skill first for SDK setup. This skill shows you what's possible.

Quick pattern (v1.0+):

# `client` is the CellCog SDK client; creating it is covered by the cellcog skill, not here.
# Fire-and-forget: create_chat returns immediately.
result = client.create_chat(
    prompt="Analyze this dataset: <SHOW_FILE>/path/to/data.csv</SHOW_FILE>",
    notify_session_key="agent:main:main",
    task_label="data-analysis",
    chat_mode="agent",  # Agent mode for most data work
)
# The daemon notifies you when the task completes. Do NOT poll.
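The prompt in the pattern above embeds each file path in `<SHOW_FILE>` tags. A small helper can assemble that string for one or more files. This is only a sketch: `build_prompt` is a hypothetical name, not part of the CellCog SDK, and it does nothing beyond string formatting.

```python
def build_prompt(instruction, file_paths):
    """Append each path, wrapped in SHOW_FILE tags, to the instruction."""
    tagged = " ".join(f"<SHOW_FILE>{p}</SHOW_FILE>" for p in file_paths)
    return f"{instruction} {tagged}".strip()

prompt = build_prompt("Analyze this dataset:", ["/path/to/data.csv"])
print(prompt)
```

The resulting string would be passed as the `prompt` argument of `client.create_chat`.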

What Makes Data-Cog Different

Code as Tool, Not as Output

Other AI tools give you Python code and say "run this." CellCog runs the code for you and delivers the results:

Other AI Tools | Data-Cog
"Here's a pandas script to analyze your data" | Here are your actual insights with charts
"Run this matplotlib code to see the chart" | Here's the chart, annotated with findings
"This SQL query will find outliers" | Found 23 outliers, here's what they mean
"You'll need scikit-learn for this" | Model trained, here's accuracy and feature importance

You upload data. You get answers. The code runs behind the scenes.


What Data Work You Can Do

Exploratory Data Analysis

Understand your data fast:

  • Dataset Profiling: "Analyze this CSV — distributions, missing values, outliers, correlations, and data quality summary"
  • Pattern Discovery: "What patterns and trends exist in this sales data? Surprise me."
  • Anomaly Detection: "Find unusual patterns in this server log data — what looks abnormal?"
  • Relationship Analysis: "What factors most strongly correlate with customer churn in this dataset?"

Example prompt:

"Analyze this dataset: <SHOW_FILE>/path/to/customer_data.csv</SHOW_FILE>

I don't know much about this data yet. Give me:

  • Overview: rows, columns, data types, missing values
  • Key distributions and summary statistics
  • Most interesting correlations
  • Any outliers or data quality issues
  • 3-5 insights that jump out

Present findings as an interactive HTML report with charts."

Data Cleaning & Transformation

Wrangle messy data into shape:

  • Clean Messy Data: "Clean this CSV — fix inconsistent date formats, handle missing values, remove duplicates, standardize column names"
  • Data Transformation: "Pivot this transaction data into a monthly summary by product category"
  • Data Merging: "Join these three CSV files on customer_id and create a unified dataset"
  • Feature Engineering: "Create useful features from this raw data for predicting house prices"

Example prompt:

"Clean and transform this dataset: <SHOW_FILE>/path/to/messy_data.csv</SHOW_FILE>

Issues I know about:

  • Dates are in mixed formats (MM/DD/YYYY and YYYY-MM-DD)
  • 'Revenue' column has some values with $ signs and commas
  • Duplicate rows exist
  • Missing values in 'Region' column

Clean it up and give me back a clean CSV plus a summary of what you changed."
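For orientation, the four issues listed in that prompt correspond to cleaning steps like the ones below. This is a standard-library sketch of the general technique, not the code CellCog actually runs. The `Date` column name, the `UNKNOWN` placeholder, and the sample rows are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")  # the two formats named in the prompt

def normalize_date(text):
    """Return the date as YYYY-MM-DD, trying each known format in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def parse_revenue(text):
    """Strip '$' and thousands separators, then convert to float."""
    return float(text.replace("$", "").replace(",", "").strip())

def clean_rows(csv_text):
    """Normalize dates and revenue, fill missing Region, drop duplicates."""
    seen, cleaned = set(), []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = (
            normalize_date(row["Date"]),
            parse_revenue(row["Revenue"]),
            row["Region"].strip() or "UNKNOWN",  # placeholder for missing values
        )
        if record not in seen:  # duplicates are compared after normalization
            seen.add(record)
            cleaned.append(record)
    return cleaned

sample = """Date,Revenue,Region
03/15/2024,"$1,200.50",West
2024-03-15,1200.50,West
2024-03-16,$75,
"""
for record in clean_rows(sample):
    print(record)
```

Deduplicating after normalization is a deliberate choice here: the first two sample rows differ only in formatting, so they collapse into one record.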

Statistical Analysis

Rigorous analysis with real numbers:

  • Hypothesis Testing: "Is there a statistically significant difference in conversion rates between our A and B variants?"
  • Regression Analysis: "What factors predict employee salary in this HR dataset? Build a regression model."
  • Time Series Analysis: "Analyze this monthly revenue data — trend, seasonality, and forecast next 6 months"
  • Cohort Analysis: "Create a cohort analysis showing user retention by signup month"

Example prompt:

"I ran an A/B test on our checkout page: <SHOW_FILE>/path/to/ab_test_results.csv</SHOW_FILE>

Columns: user_id, variant (A or B), converted (0/1), revenue, timestamp

Tell me:

  • Is variant B statistically better? (p-value, confidence interval)
  • Conversion rate difference
  • Revenue per user difference
  • Sample size adequacy check
  • My recommendation: ship B or keep testing?

Present with clear charts and a plain-English conclusion."
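To make the first question in that prompt concrete, one common way to compare two conversion rates is a pooled two-proportion z-test. The sketch below uses only the standard library. It is one reasonable choice of test, not necessarily what CellCog applies, and the counts in the usage line are made up.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value

# Hypothetical counts: 100/1000 conversions for A, 130/1000 for B.
diff, z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"difference={diff:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value alone does not answer the "ship B or keep testing" question; the sample size check and revenue difference in the prompt still matter.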

Visualization & Reporting

Turn data into visual stories:

  • Chart Generation: "Create a set of charts showing our quarterly performance from this data"
  • Dashboard Reports: "Build an interactive dashboard from this sales dataset with filters by region and product"
  • Presentation-Ready Visuals: "Create publication-quality charts from this research data"
  • Comparison Visuals: "Visualize how our metrics compare to industry benchmarks"

Machine Learning

Applied ML without the setup:

  • Classification: "Predict which customers will churn based on this dataset — train a model, show feature importance"
  • Clustering: "Segment these customers into groups based on behavior — how many natural clusters exist?"
  • Forecasting: "Forecast next quarter's sales using this historical data"
  • Model Evaluation: "I trained a model — here are the predictions. Evaluate: accuracy, precision, recall, confusion matrix, ROC curve"

Example prompt:

"Predict customer churn from this dataset: <SHOW_FILE>/path/to/customer_features.csv</SHOW_FILE>

Target column: 'churned'

  • Train a model, try at least 2 algorithms
  • Show feature importance — what drives churn?
  • Confusion matrix and ROC curve
  • Plain-English summary: 'The top 3 reasons customers churn are...'
  • Actionable recommendations based on findings

I want insights, not just metrics."
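The evaluation metrics named in this section (confusion matrix, accuracy, precision, recall) follow directly from four counts. A minimal pure-Python sketch for binary labels is shown below. The function name `binary_metrics` and the sample labels are illustrative only and say nothing about how CellCog computes its reports.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up labels: 1 means the customer churned.
metrics = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(metrics)
```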


Supported Data Formats

Format | How to Send
CSV | Upload via SHOW_FILE
Excel (XLSX) | Upload via SHOW_FILE
JSON | Upload via SHOW_FILE
Parquet | Upload via SHOW_FILE
SQL exports | Upload the dump via SHOW_FILE
Inline data | Describe small datasets directly in prompt

Output Formats

Format | Best For
Interactive HTML Dashboard | Explorable charts, filters, drill-downs
PDF Report | Shareable analysis reports with charts and findings
Clean CSV/XLSX | Cleaned or transformed data files for downstream use
Markdown | Quick insights for integration into docs

Chat Mode for Data

Scenario | Recommended Mode
Quick data cleaning, simple charts, basic statistics | "agent"
Deep analysis with multiple techniques, ML modeling, comprehensive reports | "agent team"

Use "agent" for most data work. Data cleaning, EDA, chart generation, and standard statistical analysis execute well in agent mode.

Use "agent team" for complex analytical projects — multi-technique analysis, ML model comparisons, or when you need deep domain reasoning about what the data means.
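If the mode is chosen programmatically, the guidance above reduces to a simple rule. The sketch below is one way to encode it. The task labels in `COMPLEX_TASKS` and the function name are hypothetical; only the two mode strings, "agent" and "agent team", come from this document.

```python
# Hypothetical task labels that the guidance above treats as complex work.
COMPLEX_TASKS = {"ml_modeling", "multi_technique_analysis", "comprehensive_report"}

def choose_chat_mode(task_kinds):
    """Return "agent team" if any requested task is complex, else "agent"."""
    return "agent team" if COMPLEX_TASKS & set(task_kinds) else "agent"

print(choose_chat_mode(["data_cleaning", "simple_charts"]))
print(choose_chat_mode(["eda", "ml_modeling"]))
```

The returned string would be passed as `chat_mode` in the `client.create_chat` call.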


Example Prompts

Minimal prompt, maximum insight:

"Analyze this: <SHOW_FILE>/path/to/data.csv</SHOW_FILE>

Tell me everything interesting."

That's it. CellCog's coding agent will profile the data, run exploratory analysis, find patterns, and present findings with charts. You don't need to know what to ask — the agent figures it out.

Business analysis:

"Analyze our e-commerce data: <SHOW_FILE>/path/to/orders.csv</SHOW_FILE>

I need:

  • Revenue trends (daily, weekly, monthly)
  • Best and worst performing products
  • Customer purchase frequency distribution
  • Average order value trends
  • Seasonal patterns
  • Top 5 actionable insights for growing revenue

Interactive HTML dashboard with all charts."

Research data analysis:

"Analyze this survey data from 500 respondents: <SHOW_FILE>/path/to/survey.csv</SHOW_FILE>

Research questions:

  1. Is there a significant relationship between age group and product preference?
  2. Do satisfaction scores differ by region? (ANOVA)
  3. What factors best predict likelihood to recommend? (regression)

Include: statistical tests, p-values, effect sizes, and publication-ready charts. PDF report format."


Tips for Better Data Analysis

  1. Just upload and ask: You don't need to describe every column. CellCog reads the data and figures out what's there.

  2. State your question: "What drives churn?" is more focused than "Analyze this data." Both work, but the first gets faster results.

  3. Mention the audience: "For my CEO" means executive summary. "For the data team" means show the methodology.

  4. Specify what you'll do with it: "I need to present this to the board" vs "I need clean data for my ML pipeline" — context shapes the output.

  5. Don't over-specify methods: Let CellCog choose the right statistical approach. Say what you want to learn, not which algorithm to use.

  6. Iterate: Upload data → get initial analysis → ask follow-up questions → go deeper. CellCog maintains context across messages.
