Crawl By Desearch
v1.0.1
Crawl/scrape and extract content from any webpage URL. Returns the page content as clean text or raw HTML. Use this when you need to read the full contents o...
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description (crawl/scrape pages) match the included CLI script which calls https://api.desearch.ai/web/crawl. The only required secret is DESEARCH_API_KEY, which is appropriate for a hosted crawl API.
Instruction Scope
SKILL.md instructs the agent to call the Desearch API and set DESEARCH_API_KEY; the included script does exactly that. Minor inconsistency: SKILL.md states responses are plain text or raw HTML (not JSON), but the script will pretty-print any JSON object returned by the API. This is a benign mismatch in how results are presented.
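The presentation behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual script: `render_body` is a hypothetical helper name, and the sketch assumes only bodies that parse as a JSON object are pretty-printed, with everything else passed through unchanged.

```python
import json


def render_body(body: str) -> str:
    """Pretty-print JSON objects; return any other body unchanged.

    Hypothetical helper: the name and exact behaviour are assumptions.
    """
    try:
        parsed = json.loads(body)
    except ValueError:
        # Plain text or HTML is not valid JSON, so pass it through as-is.
        return body
    if isinstance(parsed, dict):
        return json.dumps(parsed, indent=2)
    return body
```

Under these assumptions, a text or HTML crawl result is printed verbatim, while a JSON object (such as an error payload) is re-indented for readability.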
Install Mechanism
There is no install step; the skill is instruction-only with a small included Python script that uses only the standard library (urllib). No downloads or archive extraction are performed.
Credentials
Only one environment variable (DESEARCH_API_KEY) is required and is directly used to authorize requests to the stated API. No unrelated secrets, config paths, or excessive permissions are requested.
Persistence & Privilege
The skill does not request always:true and does not modify other skills or system-wide settings. Default autonomous invocation is allowed (platform default) but not combined with other concerning privileges.
Assessment
This skill is a thin client for the Desearch API and appears coherent. Before installing, confirm you trust desearch.ai and that you are comfortable sending target URLs (and their contents) to that external service. Treat the DESEARCH_API_KEY like any API secret: use least-privilege keys if supported, rotate keys periodically, and avoid using a key that has broader account permissions than necessary. Note that SKILL.md says the response is plain text/HTML, but the script may receive JSON objects from the API and pretty-print them; this is informational only. If you need offline/local crawling or want guarantees about sensitive content, do not send private pages to a third-party API.
Like a lobster shell, security has layers: review code before you run it.
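One way to act on the advice about private pages is a pre-flight check that rejects obviously internal URLs before they reach the crawl command. The sketch below is not part of the skill: `looks_private` and `INTERNAL_SUFFIXES` are hypothetical names, and the suffix list is an assumption to adjust for your own network.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical hostname suffixes treated as internal; adjust for your network.
INTERNAL_SUFFIXES = (".local", ".internal", ".lan")


def looks_private(url: str) -> bool:
    """Return True if the URL's host is obviously local or private.

    Only inspects the URL string; it performs no DNS lookup, so a public
    hostname that resolves to a private address is not detected.
    """
    host = (urlparse(url).hostname or "").lower()
    # A missing host is treated as unsafe to send rather than as public.
    if not host or host == "localhost" or host.endswith(INTERNAL_SUFFIXES):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; treat ordinary hostnames as public.
        return False
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

This check is deliberately conservative and string-based. It does not replace judgment about which public pages contain sensitive content.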
Runtime requirements
🕷️ Clawdis
Env: DESEARCH_API_KEY
latest
Crawl Webpage By Desearch
Extract content from any webpage URL. Returns clean text or raw HTML.
Quick Start
- Get an API key from https://console.desearch.ai
- Set environment variable:
export DESEARCH_API_KEY='your-key-here'
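If you call the skill from your own Python code, it can help to fail early when the key is missing. The helper below is a minimal sketch; `require_api_key` is a hypothetical name and is not provided by the skill.

```python
import os


def require_api_key(env=None) -> str:
    """Return DESEARCH_API_KEY from the environment or raise a clear error."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    key = env.get("DESEARCH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DESEARCH_API_KEY is not set; export it before running the skill"
        )
    return key
```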
Usage
# Crawl a webpage (returns clean text by default)
scripts/desearch.py crawl "https://en.wikipedia.org/wiki/Artificial_intelligence"
# Get raw HTML
scripts/desearch.py crawl "https://example.com" --crawl-format html
Options
| Option | Description |
|---|---|
| --crawl-format | Output content format: text (default) or html |
Examples
Read a documentation page
scripts/desearch.py crawl "https://docs.python.org/3/tutorial/index.html"
Get raw HTML for analysis
scripts/desearch.py crawl "https://example.com/page" --crawl-format html
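The same invocations can be driven from Python. The wrapper below is a sketch under stated assumptions: it is run from the skill's root directory, `scripts/desearch.py` is executable, `DESEARCH_API_KEY` is already exported, and the script exits with a non-zero status on failure (not confirmed by the documentation). `build_crawl_command` and `crawl` are hypothetical names.

```python
import subprocess

VALID_FORMATS = ("text", "html")


def build_crawl_command(url: str, crawl_format: str = "text") -> list:
    """Build the argv list for the CLI invocations shown above."""
    if crawl_format not in VALID_FORMATS:
        raise ValueError(f"crawl_format must be one of {VALID_FORMATS}")
    cmd = ["scripts/desearch.py", "crawl", url]
    if crawl_format != "text":
        # text is the documented default, so the flag is only added for html.
        cmd += ["--crawl-format", crawl_format]
    return cmd


def crawl(url: str, crawl_format: str = "text") -> str:
    """Run the CLI and return its stdout.

    Raises subprocess.CalledProcessError if the script exits non-zero
    (assumed, not documented, behaviour on failure).
    """
    result = subprocess.run(
        build_crawl_command(url, crawl_format),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.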
Response
Example (format=text, truncated, default)
Artificial intelligence (AI) is the capability of computational systems to perform tasks that typically require human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making...
Example (format=html, truncated)
<!DOCTYPE html>
<html>
<head><title>Artificial intelligence - Wikipedia</title></head>
<body>
<p>Artificial intelligence (AI) is the capability of computational systems...</p>
</body>
</html>
Notes
- Response is plain text or raw HTML, not JSON (the error responses listed under Errors are JSON).
- Default format is text. Use --crawl-format html only when you need to inspect page structure.
- Prefer text format to avoid bloating the agent context with markup.
Errors
Status 401, Unauthorized (e.g., missing/invalid API key)
{
  "detail": "Invalid or missing API key"
}
Status 402, Payment Required (e.g., balance depleted)
{
  "detail": "Insufficient balance, please add funds to your account to continue using the service."
}
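Because these error bodies are JSON with a single `detail` field, a caller can turn them into one-line messages. The following is a minimal sketch; `describe_error` and `KNOWN_ERRORS` are hypothetical names, and only the two status codes documented above are labelled.

```python
import json

# Status codes and meanings taken from the error examples above.
KNOWN_ERRORS = {401: "Unauthorized", 402: "Payment Required"}


def describe_error(status: int, body: str) -> str:
    """Turn an error status and body into a one-line message."""
    label = KNOWN_ERRORS.get(status, "Error")
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None
    # Fall back to the raw body when there is no JSON "detail" field.
    detail = parsed.get("detail") if isinstance(parsed, dict) else None
    return f"{status} {label}: {detail or body.strip()}"
```

Any other status code is reported with a generic label and whatever body was returned.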
