Image-crawler

v1.0.0

An image collection/crawler tool that supports the Baidu and Bing image search engines. Use it when the user asks to collect, crawl, download, or gather images. Supports keyword expansion, image deduplication (URL + content hash, persisted across runs), progress monitoring, and stall detection. Trigger phrases: collect images, crawl images, download images, image crawler, scrape images.

by MagicWolf (@mx2013713828)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
The name and description match the provided scripts: the package contains crawler implementations for Baidu and Bing plus a wrapper script that coordinates search, download, deduplication, and progress reporting. However, the registry metadata declares no required binaries or environment variables, while the SKILL.md and scripts assume a Python runtime and the 'requests' library; that runtime dependency is not declared in the metadata.
Instruction Scope
SKILL.md instructs the agent to extract keywords, expand them, run the bundled Python script in JSON mode and monitor its line-delimited JSON output. The instructions stay within the crawler's scope and do not request unrelated files, system credentials, or external endpoints beyond search engines and target image hosts. Use of the LLM to expand keywords is intentional for coverage and is documented.
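The monitoring loop the instructions describe (read line-delimited JSON from the script, detect stalls) can be sketched as follows. The function name, the event field "downloaded", and the stall heuristic are assumptions for illustration, not the script's documented interface.

```python
import json
import subprocess

def monitor_crawler(cmd, stall_limit=10):
    """Run a crawler subprocess and watch its line-delimited JSON output.

    Terminates the process if the (assumed) 'downloaded' counter stops
    advancing for `stall_limit` consecutive events -- a simple stall check.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    last_count = -1
    stalled = 0
    for line in proc.stdout:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON noise on stdout
        count = event.get("downloaded", last_count)
        stalled = stalled + 1 if count == last_count else 0
        last_count = count
        if stalled >= stall_limit:
            proc.terminate()  # no progress for too long: give up
            break
    proc.wait()
    return last_count
```

A caller would pass the crawler's command line (e.g. the bundled script in JSON mode) and act on the returned count.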
Install Mechanism
This is an instruction-only skill (no install spec). The included code runs as Python scripts and makes network calls. There is no remote download/installation of code at install time and no obscure third-party install URLs. Note: the script exits if 'requests' is not installed and prints instructions to pip install it; this dependency should be declared in the metadata.
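The "exit with install instructions" behavior is a common guard pattern; a minimal sketch (the function name and wording are hypothetical, not copied from the skill's code):

```python
import importlib
import sys

def require(module_name, pip_name=None):
    """Import a dependency, or exit with pip install instructions.

    Mirrors the pattern the review describes: rather than crashing with a
    raw ImportError, print an actionable message and stop.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        sys.exit(f"Missing dependency '{module_name}'. Install it with:\n"
                 f"    pip install {pip_name or module_name}")
```

Declaring the dependency in the registry metadata would remove the need for this runtime check.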
Credentials
The skill requests no environment variables or credentials and does not attempt to access system config paths beyond writing to the user-specified output directory. Network access to Bing, Baidu, and arbitrary image hosts is required and expected for its purpose.
Persistence & Privilege
The skill does not request permanent 'always' inclusion, nor does it modify other skills or system-wide settings. It persists deduplication hashes to a file under the chosen output directory (.dedup_hashes.json), which is consistent with stated behavior.
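The persisted URL + content-hash deduplication could look roughly like the sketch below. Only the filename .dedup_hashes.json comes from the review; the class, method names, and JSON layout are assumptions.

```python
import hashlib
import json
from pathlib import Path

class DedupStore:
    """Sketch of cross-run dedup state kept under the output directory."""

    def __init__(self, output_dir):
        self.path = Path(output_dir) / ".dedup_hashes.json"
        if self.path.exists():
            data = json.loads(self.path.read_text())
            self.urls = set(data.get("urls", []))
            self.hashes = set(data.get("hashes", []))
        else:
            self.urls, self.hashes = set(), set()

    def seen_url(self, url):
        return url in self.urls

    def add(self, url, content):
        """Record an image; return False if identical bytes were seen before."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self.hashes:
            return False
        self.urls.add(url)
        self.hashes.add(digest)
        return True

    def save(self):
        # Persist both sets so a later run skips known URLs and contents.
        self.path.write_text(json.dumps(
            {"urls": sorted(self.urls), "hashes": sorted(self.hashes)}))
```

Hashing the downloaded bytes (not just the URL) catches the common case of the same image mirrored at different hosts.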
Assessment
This skill appears to do what it says: scrape images from Baidu/Bing and deduplicate them. Before installing or running it:
1. Run it in a controlled environment (sandbox or non-privileged account), because it will download many files and consume network bandwidth.
2. Install Python and the 'requests' package (pip install requests); the skill does not declare this dependency in its metadata.
3. Set a safe output directory and a disk quota to avoid filling your disk.
4. Respect website terms of service and robots.txt, and be aware of the legal and ethical issues around mass scraping.
5. Consider lowering concurrency and increasing delays (the code already exposes sleep/timeout settings) to reduce anti-scraping risk.
6. Review the scripts for changes before running them on sensitive hosts. No hidden network sinks or credential access were found, but the crawler fetches arbitrary external URLs, which can host unexpected content.
7. Do not run it as root/administrator, and avoid supplying any unrelated credentials to the skill.
If you want higher assurance, ask the publisher to update the metadata to declare Python and 'requests' as required dependencies and to provide explicit install instructions.
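The throttling recommended in point 5 amounts to low concurrency plus a fixed inter-request delay. A minimal sketch of that access pattern (sequential, delayed, failure-tolerant); the function name and return shape are illustrative, not the skill's API:

```python
import time
import urllib.request

def polite_fetch(urls, delay=1.0, timeout=10):
    """Fetch URLs one at a time with a pause between requests.

    Sequential fetching with a delay is the simplest way to stay under
    anti-scraping thresholds; failures are recorded rather than fatal.
    """
    results = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                results.append((url, resp.read()))
        except OSError:
            results.append((url, None))  # keep going past dead links
        time.sleep(delay)  # fixed gap between requests reduces server load
    return results
```

Raising `delay` (or adding jitter) trades throughput for a lower chance of being rate-limited or blocked.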


Latest version: vk97by7aynyns881yj5wxyhq11x83r6bp

