SiteOne Crawler Review
Audited by ClawScan on May 10, 2026.
Overview
This appears to be a legitimate website crawler wrapper, but it relies on an external binary and includes optional upload, credential, and load-test features that should be used carefully.
Before installing, verify that you trust the SiteOne Crawler GitHub release and prefer a pinned, verified version if possible. Use the crawler only on sites you own or are authorized to test, especially for stress/load tests. Keep reports local for private or authenticated sites unless you intentionally use the upload feature and understand the retention.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing the skill may lead the agent or user to run a third-party executable that was not included in the reviewed artifacts.
The skill's setup downloads and executes a latest-release binary from GitHub without pinning a version or showing checksum/signature verification.
If neither exists, download the latest release from GitHub: ... `curl -sL "$RELEASE_URL" -o /tmp/siteone-crawler.zip` ... `chmod +x "$INSTALL_DIR/siteone-crawler"`
Install SiteOne Crawler only from a trusted source, consider pinning a known version, and verify checksums or signatures when available.
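The pinning-and-verification recommendation can be sketched as a small shell guard. This is an illustrative sketch, not the skill's actual install script: the demo file, the `verify_sha256` helper, and the commented-out variables (`RELEASE_URL`, `EXPECTED_SHA256`) are assumptions you would fill in from a specific, trusted release page.

```shell
#!/bin/sh
# Sketch: verify a downloaded archive against a known checksum before
# making anything executable. Runs against a local demo file so the
# logic is testable without network access.
set -eu

verify_sha256() {
  # $1 = path to file, $2 = expected hex digest
  actual="$(sha256sum "$1" | awk '{print $1}')"
  [ "$actual" = "$2" ]
}

# Demo: create a local file and check it against its own digest.
printf 'hello\n' > /tmp/demo.bin
expected="$(sha256sum /tmp/demo.bin | awk '{print $1}')"
if verify_sha256 /tmp/demo.bin "$expected"; then
  echo "checksum ok"
fi

# In a real install you would pin a specific tagged release and compare
# against the checksum published alongside it, e.g.:
#   verify_sha256 /tmp/siteone-crawler.zip "$EXPECTED_SHA256" || exit 1
```

The point of the guard is that a failed comparison aborts before `chmod +x` ever runs, which the skill's unpinned `curl | chmod` flow does not do.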
Running load tests against sites you do not own or have permission to test could disrupt service or violate terms of use.
The skill documents load-testing options that can generate significant traffic, and it correctly warns about DoS risk.
`$CRAWLER --url="https://example.com" --workers="20" --max-reqs-per-sec="100" --max-depth="1"` ... Warning: high worker counts can cause DoS.
Use stress/load-test options only on authorized targets, keep rate limits conservative, and respect robots.txt and site policies unless you have explicit permission.
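One way to enforce "authorized targets only" is a wrapper that refuses to run stress flags against hosts outside an explicit allowlist. This is a sketch under stated assumptions: `ALLOWED_HOSTS` and the example target are placeholders, and the commented-out crawler invocation reuses the `--workers`/`--max-reqs-per-sec` flags documented above with deliberately conservative values.

```shell
#!/bin/sh
# Sketch: gate load-test runs behind an allowlist of hosts you are
# authorized to stress-test.
set -eu

# Assumption: fill in with hosts you own or have written permission to test.
ALLOWED_HOSTS="example.com staging.example.com"

host_authorized() {
  for h in $ALLOWED_HOSTS; do
    [ "$h" = "$1" ] && return 0
  done
  return 1
}

target="https://example.com/some/path"
# Extract the hostname from the target URL.
target_host="$(printf '%s\n' "$target" | sed -E 's#^https?://([^/]+).*#\1#')"

if host_authorized "$target_host"; then
  echo "authorized target: $target_host"
  # "$CRAWLER" --url="$target" --workers="2" --max-reqs-per-sec="5"
else
  echo "refusing load test against unauthorized host: $target_host" >&2
  exit 1
fi
```

Keeping the allowlist in the wrapper (rather than in your memory) makes the authorization check fail closed when someone pastes in a new URL.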
If used on private or authenticated sites, uploaded reports could contain URLs, page content, headers, or findings that you may not want stored externally.
The crawler can upload HTML audit reports to an external SiteOne endpoint with a retention period.
`--upload-to=<url>` | Upload endpoint | `https://crawler.siteone.io/up`; `--upload-retention=<val>` ... `30d`
Avoid `--upload` for private sites unless you understand the destination and retention policy; prefer local reports or set the shortest appropriate retention and an upload password if needed.
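The "do not combine authenticated crawls with upload" rule from the findings can be checked mechanically before invoking the crawler. A minimal sketch: `check_args` is a hypothetical helper, but the flag names it inspects (`--http-auth`, `--cookie`, `--upload`, `--upload-to`) are the ones documented by the skill.

```shell
#!/bin/sh
# Sketch: refuse to run when credential flags and upload flags are
# combined on the same command line.
set -eu

check_args() {
  # Returns non-zero if arguments mix credentials with report upload.
  creds=0 upload=0
  for a in "$@"; do
    case "$a" in
      --http-auth=*|--cookie=*) creds=1 ;;
      --upload|--upload-to=*) upload=1 ;;
    esac
  done
  [ "$creds" -eq 0 ] || [ "$upload" -eq 0 ]
}

# Demo: a combination the guard should reject.
if check_args --url="https://example.com" --http-auth="user:pass" --upload; then
  echo "ok to run"
else
  echo "refusing: authenticated crawl combined with report upload" >&2
fi
```

Either flag on its own passes; only the combination is blocked, matching the recommendation above.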
Supplying credentials or cookies gives the crawler access to authenticated content and may expose secrets through command history, process listings, reports, or uploads if used carelessly.
The documented CLI supports optional HTTP auth, cookies, custom headers, and SMTP credentials for protected crawling or email delivery.
`--http-auth=<user:pass>` ... `--cookie=<val>` ... `--mail-smtp-pass=<val>`
Use least-privilege test credentials, avoid putting secrets directly on the command line where they persist in shell history, and do not combine authenticated crawls with report upload unless approved.
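Keeping the secret out of shell history can be sketched by loading it from a restricted file rather than typing it inline. The file path and credentials here are illustrative assumptions; `--http-auth` is the skill's documented flag.

```shell
#!/bin/sh
# Sketch: read test credentials from a mode-600 file so the secret is
# never typed on an interactive command line (and so never lands in
# shell history). Path and values are placeholders.
set -eu

cred_file="/tmp/crawler-auth.txt"
printf 'testuser:testpass\n' > "$cred_file"
chmod 600 "$cred_file"

http_auth="$(cat "$cred_file")"
echo "loaded credentials for user: ${http_auth%%:*}"
# "$CRAWLER" --url="https://example.com" --http-auth="$http_auth"
# Caveat: the value still appears in process listings while the crawler
# runs, one of the exposure paths this finding flags; use throwaway
# test credentials regardless.
```

This mitigates the history exposure only; the process-listing and report/upload exposure paths from the finding remain, which is why least-privilege credentials matter even with careful handling.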
