Crawl4ai Docker Skill

Pass. Audited by ClawScan on May 1, 2026.

Overview

The artifacts are coherent for a Docker-based web crawler and do not show malicious behavior, but users should be aware of the persistent local service, the optional LLM API keys, and the unpinned Docker image.

This skill appears appropriate if you intend to run a local Crawl4AI Docker service. Before installing or using it, pin the Docker image version, protect any `.llm.env` API keys, restrict access to port 11235, and avoid crawling sensitive/private pages through LLM extraction unless you trust the configured provider.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

If misused, the crawler service could be directed to fetch unintended pages or execute browser-side JavaScript during crawling tasks.

Why it was flagged

The skill documents a JavaScript execution endpoint as part of the crawler API. This is consistent with browser-based scraping, but it is a powerful capability that should only be used against intended targets.

Skill content
| `POST /execute_js` | POST | JavaScript execution |
Recommendation

Use the service only for URLs you are authorized to crawl, and avoid enabling or calling JavaScript execution paths unless needed.
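
One way to enforce the "authorized URLs only" recommendation is to check each target against an allowlist before it reaches the crawler. The following is a minimal sketch, not part of the skill: the function name `is_authorized_target` and the hosts in `ALLOWED_HOSTS` are hypothetical placeholders.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts you are authorized to crawl.
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def is_authorized_target(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is explicitly allowlisted."""
    parsed = urlparse(url)
    # Reject non-web schemes such as file: or javascript:.
    if parsed.scheme not in ("http", "https"):
        return False
    # urlparse lowercases the hostname; an exact match avoids suffix tricks
    # like example.com.evil.org.
    host = (parsed.hostname or "").lower()
    return host in allowed_hosts
```

A caller would run this check before issuing any crawl or JavaScript execution request and skip URLs for which it returns `False`.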

What this means

LLM extraction can consume quota or incur charges on the configured provider account.

Why it was flagged

The documentation asks users to place LLM provider API keys in a `.llm.env` file. This is expected for LLM extraction, but it grants the service access to the user's provider account.

Skill content
OPENROUTER_API_KEY=your-api-key ... OPENAI_API_KEY=sk-your-key
Recommendation

Use scoped or low-privilege API keys, keep `.llm.env` out of version control, and monitor provider usage.
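
As a rough illustration of keeping `.llm.env` out of version control, the sketch below scans the text of a `.gitignore` file for a pattern that covers the file name. The helper `env_file_is_ignored` is hypothetical, and it deliberately simplifies real gitignore semantics (no directory rules, no negation handling); `git check-ignore` remains the authoritative check.

```python
from fnmatch import fnmatch


def env_file_is_ignored(gitignore_text: str, filename: str = ".llm.env") -> bool:
    """Rough check: does any plain (non-comment, non-negated) pattern match?"""
    for raw in gitignore_text.splitlines():
        pattern = raw.strip()
        # Skip blanks, comments, and negation rules (not modeled here).
        if not pattern or pattern.startswith("#") or pattern.startswith("!"):
            continue
        # A leading slash anchors to the repo root; strip it for a name match.
        if fnmatch(filename, pattern.lstrip("/")):
            return True
    return False
```

If the function returns `False` for your repository's `.gitignore`, adding a `.llm.env` line is the simplest fix.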

What this means

A later `docker pull` could fetch a different image than the one originally reviewed or tested.

Why it was flagged

The example Docker Compose configuration uses the mutable `latest` tag for an external container image. That is common in examples, but future pulls may retrieve different code.

Skill content
"image": "unclecode/crawl4ai:latest"
Recommendation

Pin the image to a specific version or digest and verify the upstream image source before deployment.
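
To make the pinning recommendation checkable, the following sketch classifies an image reference as pinned or not. It is an illustration under simple assumptions, not an official Docker parser: `is_pinned` is a hypothetical helper, and it treats any tag other than `latest` as pinned even though tags can be re-pushed upstream. A digest is the stronger guarantee.

```python
import re

# A digest reference ends with @sha256: followed by 64 hex characters.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image: str) -> bool:
    """Return True if the reference uses a digest or a non-`latest` tag."""
    if _DIGEST.search(image):
        return True
    # Look for the tag only in the final path segment, so a registry port
    # such as localhost:5000/app is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag: Docker falls back to `latest`
    tag = last_segment.rsplit(":", 1)[1]
    return tag != "" and tag != "latest"
```

Applied to the reference quoted above, `is_pinned("unclecode/crawl4ai:latest")` returns `False`.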

What this means

Content from crawled pages and user extraction instructions may be sent to the configured LLM provider.

Why it was flagged

The LLM extraction examples route crawled content and extraction instructions to a configured LLM provider. This is disclosed and purpose-aligned, but it creates an external provider data flow.

Skill content
"type": "llm", "provider": "openrouter/free", "instruction": "Summarize the main content of the web page"
Recommendation

Do not use LLM extraction on private or sensitive pages unless the provider's data handling is acceptable.
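
A coarse guard can stop obviously internal targets from being sent through LLM extraction. The sketch below is a heuristic only: `looks_private` and the suffixes in `INTERNAL_SUFFIXES` are hypothetical, it does not resolve DNS, and it cannot detect sensitive pages hosted on public domains, so it supplements rather than replaces a manual judgment.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical internal suffixes; adjust to your own network naming.
INTERNAL_SUFFIXES = (".local", ".internal", ".lan")


def looks_private(url: str) -> bool:
    """Heuristic: True if the host is loopback, private-range, or internal-looking."""
    host = (urlparse(url).hostname or "").lower()
    # Treat a missing host conservatively, as if it were private.
    if not host or host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; fall back to a name-based guess.
        return host.endswith(INTERNAL_SUFFIXES)
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

When the function returns `True`, a caller could fall back to a non-LLM extraction mode or skip the page.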

What this means

The crawler service may continue running and accepting local API requests after setup.

Why it was flagged

The reference configuration publishes the REST API port and keeps the container running until stopped. This is normal for a Docker service, but it is persistent behavior users should intentionally manage.

Skill content
"ports": ["11235:11235"], ... "restart": "unless-stopped"
Recommendation

Run it only in a trusted environment, bind or firewall the port appropriately, and stop the container when it is no longer needed.
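
Binding the published port to the loopback interface is one way to limit who can reach the API. Docker Compose accepts the `IP:HOST:CONTAINER` short form for port mappings, and the sketch below rewrites a mapping like the one quoted above into that form. The helper `bind_to_loopback` is hypothetical and handles only IPv4-style strings.

```python
def bind_to_loopback(port_mapping: str) -> str:
    """Rewrite an all-interfaces mapping so it listens on 127.0.0.1 only."""
    parts = port_mapping.split(":")
    if len(parts) == 2:
        # "HOST:CONTAINER" publishes on all interfaces by default.
        host_port, container_port = parts
    elif len(parts) == 3 and parts[0] == "0.0.0.0":
        _, host_port, container_port = parts
    else:
        # Already bound to a specific address, or a form not handled here.
        return port_mapping
    return f"127.0.0.1:{host_port}:{container_port}"
```

With this change the service remains usable from the same machine but is not reachable from other hosts on the network.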