AI Data Scraper

Pass. Audited by ClawScan on May 1, 2026.

Overview

No artifact-backed malicious behavior was found; this appears to be a simple scraper, though users should review its broad URL-fetching behavior and overstated feature claims.

This skill looks low risk from the provided artifacts, but it is very minimal. Before using it, confirm you are allowed to scrape the target URL/API, keep outputs under review before sharing, make sure curl is available, and do not assume the advertised scheduling, proxy, retry, deduplication, or monitoring features actually exist.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

If used on unsafe or unintended targets, the skill could fetch and store data from places the user did not mean to access.

Why it was flagged

The script fetches whatever URL/API value it is given and saves the response locally. This is purpose-aligned for a scraper, but it is broad network-fetching capability without visible scheme, domain, or target restrictions.

Skill content
local data=$(curl -sL "$URL" --compressed) ... echo "$data" > "$output_file"
Recommendation

Use it only with authorized http(s) URLs or APIs, and review generated files before sharing them.
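
The restriction recommended above could be sketched as a small guard that runs before the fetch. This is a minimal sketch and not part of the reviewed skill: the function name `is_allowed_url` and the `ALLOWED_HOSTS` list are hypothetical, and the example hosts are placeholders a user would replace with targets they are authorized to scrape.

```shell
# Hypothetical guard; not part of the reviewed skill.
# ALLOWED_HOSTS is an assumed, user-maintained, space-separated allowlist.
ALLOWED_HOSTS="example.com api.example.com"

is_allowed_url() {
    local url="$1" host allowed
    # Accept only http:// and https:// (rejects file://, ftp://, and so on).
    case "$url" in
        http://*|https://*) ;;
        *) return 1 ;;
    esac
    # Strip the scheme, then path/query/fragment, then userinfo and port.
    host="${url#*://}"
    host="${host%%[/?#]*}"
    host="${host##*@}"
    host="${host%%:*}"
    # Require an exact match against the allowlist.
    for allowed in $ALLOWED_HOSTS; do
        [ "$host" = "$allowed" ] && return 0
    done
    return 1
}
```

A caller might run `is_allowed_url "$URL" || exit 1` before invoking curl. curl's own `--proto '=http,https'` option can additionally restrict schemes at the transfer level, although with `-L` a redirect may still lead to a host outside the allowlist.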

What this means

The skill may fail or behave differently depending on whether curl is available in the environment.

Why it was flagged

The script has a runtime dependency on curl, while the registry requirements declare no required binaries. This is an under-declared dependency rather than hidden installation behavior.

Skill content
if ! command -v curl &> /dev/null; then
        log_error "curl not installed"
Recommendation

Confirm curl is installed and consider updating metadata to declare the dependency.
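
One way to confirm the dependency up front is a preflight check run before the skill is used. The helper below is a hedged sketch: `require_bins` is a hypothetical name, and the skill itself only checks for curl as shown in the excerpt above.

```shell
# Hypothetical preflight helper; not part of the reviewed skill.
require_bins() {
    local missing=0 bin
    for bin in "$@"; do
        # command -v succeeds only if the name resolves to something runnable.
        if ! command -v "$bin" >/dev/null 2>&1; then
            echo "missing required binary: $bin" >&2
            missing=1
        fi
    done
    # Return non-zero if any listed binary was not found.
    return "$missing"
}
```

For this skill the call would be `require_bins curl`; listing every binary in one call reports all missing tools at once instead of failing on the first.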

What this means

Users may assume reliability, monitoring, or anti-duplication features exist when they are not shown in the provided artifacts.

Why it was flagged

The documentation advertises proxy pools, retries, deduplication, and real-time monitoring, but the supplied script only performs direct curl fetches and local writes. This is a capability overstatement, not evidence of hidden behavior.

Skill content
- ✅ Proxy pool support
- ✅ Automatic retry
- ✅ Data deduplication
- ✅ Real-time monitoring
Recommendation

Do not rely on the advertised advanced features unless they are implemented and reviewed.
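
A quick way to start that review is to search the script for signs of each advertised feature. The sketch below is a rough heuristic under stated assumptions, not a complete audit: the function names and the search patterns are hypothetical, the patterns only look for common curl flags and shell idioms, and real-time monitoring is not covered because there is no obvious single pattern for it.

```shell
# Hypothetical heuristic check; not part of the reviewed skill or of ClawScan.
check_feature() {
    # $1: label, $2: extended regex hinting at the feature, $3: script path
    if grep -Eq -e "$2" -- "$3"; then
        echo "$1: hint found"
    else
        echo "$1: no hint"
    fi
}

report_features() {
    # $1: path to the script under review
    check_feature "proxy" '--proxy|https?_proxy' "$1"
    check_feature "retry" '--retry' "$1"
    check_feature "dedup" 'sort +-u|uniq' "$1"
}
```

Running `report_features` on the skill's script prints one line per feature. A "no hint" result is consistent with the finding above but does not prove absence, and a "hint found" result still needs manual review.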