Dataset Finder
v0.1.0
Use this skill when users need to search for datasets, download data files, or explore data repositories. Triggers include: requests to "find datasets", "search for data", "download dataset from Kaggle", "get data from Hugging Face", "find ML datasets", or mentions of data repositories like Kaggle, UCI ML Repository, Data.gov, or Hugging Face. Also use for previewing dataset statistics, generating data cards, or discovering datasets for machine learning projects. Requires OpenClawCLI installation from clawhub.ai.
⭐ 1· 1.5k·7 current·7 all-time
by Anis Afifi (@anisafifi)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill's code and SKILL.md implement dataset search, download, and preview across Kaggle, Hugging Face, and UCI, and save results locally, which matches the stated purpose. However, the SKILL.md and README repeatedly say 'Requires OpenClawCLI installation from clawhub.ai' even though neither the code nor the install instructions call an OpenClawCLI binary or service. That claimed dependency is unexplained and inconsistent with the included Python script and requirements.
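One way to check the dependency claim independently is to search the unpacked skill for any mention of the tool and for calls that launch external programs. The sketch below is a minimal illustration, not the scanner's method: the function name `scan_source` and the two patterns are assumptions chosen for this example, and a text search like this can miss obfuscated or indirect calls.

```python
import re
from pathlib import Path

# Hypothetical patterns: the names the listing mentions, plus common
# process-launch calls that could invoke an external binary.
PATTERNS = {
    "openclaw_reference": re.compile(r"openclaw|clawhub", re.IGNORECASE),
    "process_launch": re.compile(r"\bsubprocess\.|\bos\.system\("),
}


def scan_source(root, suffixes=(".py",)):
    """Return (file, line_no, label, text) for every matching line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for line_no, line in enumerate(lines, start=1):
            for label, pattern in PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), line_no, label, line.strip()))
    return hits
```

Calling `scan_source("path/to/unpacked/skill")` and getting an empty list for the `.py` files would be consistent with the finding above; passing `suffixes=(".py", ".md")` would also surface the documentation-only mentions.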
Instruction Scope
Runtime instructions are limited to running the included Python script, installing listed Python packages, and placing Kaggle/Hugging Face credentials where their respective CLIs/APIs expect them. The script downloads files from public repositories, scrapes UCI (placeholder implementation), and writes datasets under local 'datasets/' directories. There are no instructions to read arbitrary unrelated system files or to exfiltrate data to unknown endpoints.
Install Mechanism
There is no automated install spec; installation is instruction-only (pip install). The included requirements.txt references standard public Python packages (kaggle, datasets, pandas, requests, beautifulsoup4, etc.). No downloads from obscure URLs or archive extraction steps are present. This is a low-risk install mechanism, but it still requires installing networked packages from PyPI as usual.
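A reader who wants to confirm that a requirements file contains only plain PyPI names could separate out any entry that points elsewhere. This is a rough sketch under stated assumptions: `classify_requirements` and the `SUSPECT` pattern are hypothetical names for this example, and the pattern only covers option lines, URLs, and direct references, not every form pip accepts.

```python
import re

# Entries worth a manual look: pip option lines (-e, -r, --index-url, ...),
# anything containing a URL scheme, and "name @ location" direct references.
SUSPECT = re.compile(r"^-|://|\s@\s")


def classify_requirements(lines):
    """Split requirement lines into plain package specs and entries to review."""
    plain, review = [], []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and full-line comments
        line = re.split(r"\s+#", line, maxsplit=1)[0]  # drop trailing comments
        (review if SUSPECT.search(line) else plain).append(line)
    return plain, review
```

For example, `classify_requirements(open("requirements.txt").read().splitlines())` returns two lists; an empty second list matches the assessment that nothing is fetched from outside PyPI.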
Credentials
The registry metadata lists no required environment variables, which aligns with the code. The SKILL.md correctly documents Kaggle credentials (kaggle.json in ~/.kaggle) and optional HF_TOKEN for Hugging Face. That is proportionate to dataset downloads. The anomalous claim that OpenClawCLI is required (and a reference to clawhub.ai) is not justified by the code and should be clarified before use.
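Before running the skill, it can help to see which of these credentials are actually present on the machine without printing them. The following is a small sketch, not part of the skill: `credential_report` is a hypothetical helper, and the permission check assumes a POSIX system where the Kaggle file is expected to be readable only by its owner.

```python
import os
import stat
from pathlib import Path


def credential_report(home=None, env=None):
    """Report which dataset credentials are present, without reading secrets."""
    home = Path(home) if home is not None else Path.home()
    env = os.environ if env is None else env
    kaggle_file = home / ".kaggle" / "kaggle.json"
    report = {
        "kaggle_json_present": kaggle_file.is_file(),
        "kaggle_json_private": None,  # stays None when it cannot be checked
        "hf_token_set": bool(env.get("HF_TOKEN")),
    }
    if report["kaggle_json_present"] and os.name == "posix":
        mode = stat.S_IMODE(kaggle_file.stat().st_mode)
        # True when group and others have no access (for example mode 0o600).
        report["kaggle_json_private"] = (mode & 0o077) == 0
    return report
```

Calling `credential_report()` with no arguments inspects the current user's home directory and environment; the `home` and `env` parameters exist only so the check can be pointed at a test location.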
Persistence & Privilege
The skill does not request persistent/global privileges, does not set always:true, and contains no installation steps that would modify other skills or system-wide agent configuration. It writes downloaded datasets into local directories (datasets/...), which is expected behavior for its purpose.
What to consider before installing
What to check before installing/using this skill:
- Source & provenance: the package's 'Homepage' and 'Source' fields are empty, and SKILL.md claims a dependency on 'OpenClawCLI (clawhub.ai)' that the code does not use. Ask the publisher why OpenClawCLI is required and verify the project's origin before running.
- Review the code yourself: the included scripts/dataset.py performs network requests, scraping, and writes files to datasets/. Inspect it if you have sensitive local data or policies about downloads.
- Credentials: Kaggle requires a local kaggle.json file (stored in ~/.kaggle/) and Hugging Face can use HF_TOKEN; supply only narrowly scoped tokens and avoid exposing high-privilege secrets unless necessary.
- Run in a sandbox/virtualenv: install Python packages in a virtual environment and run the script with non-root privileges. This limits risk from any unexpected behavior.
- Data & license: downloaded datasets may have licensing or privacy constraints; verify dataset licenses and do not download datasets you are not authorized to use.
- Incomplete/placeholder behavior: UCI repository interaction in the script is a simplified placeholder (hard-coded examples). Expect some features to be incomplete — test with small queries first.
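The sandboxing advice in the list above can be sketched as a pair of commands built in Python. This is an illustration under assumptions: the helper names and the `.venv` directory are hypothetical, `requirements.txt` is the file the review mentions, and the commands are only constructed here, not executed.

```python
import os
import sys
from pathlib import Path


def venv_python(env_dir):
    """Path of the interpreter inside a virtual environment directory."""
    sub = "Scripts/python.exe" if os.name == "nt" else "bin/python"
    return Path(env_dir) / sub


def setup_commands(env_dir=".venv", requirements="requirements.txt"):
    """Commands that create an isolated environment and install into it."""
    return [
        [sys.executable, "-m", "venv", str(env_dir)],
        [str(venv_python(env_dir)), "-m", "pip", "install", "-r", requirements],
    ]


def running_as_root():
    """True on POSIX systems when the current user is root."""
    return hasattr(os, "geteuid") and os.geteuid() == 0
```

A caller might refuse to continue when `running_as_root()` is true, then pass each command to `subprocess.run(cmd, check=True)` so the packages land in the isolated environment rather than the system interpreter.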
If you cannot verify the OpenClawCLI claim or the publisher identity, treat the skill with extra caution (do not run on a production machine or with privileged credentials).
Like a lobster shell, security has layers: review code before you run it.
