Install
openclaw skills install @sakurakilove/local-searchLocal web search skill that runs entirely on the user's machine. Scrapes DuckDuckGo / Bing / Google HTML directly with automatic engine fallback (DDG -> Bing -> Google). Use whenever the user needs real-time web information, latest news, or content beyond the knowledge cutoff. Returns structured results (url / name / snippet / host_name / rank / date / favicon) plus source_engine, raw_html, score extensions. Supports --num, --recency-days, --locale (BCP-47), --json, --output. No API key, no SDK, no cloud hop.
openclaw skills install @sakurakilove/local-searchLocal web search with automatic engine fallback. Direct HTML scraping of DuckDuckGo / Bing / Google · resilient to rate-limiting · canonical result schema.
This skill calls the public search engines directly from your machine. No API key, no SDK, no network hop. If one engine is rate-limiting you, the orchestrator silently falls through to the next — so the call usually succeeds on the first try, even from datacenter IPs where Google is blocked.
Each result is a SearchFunctionResultItem with the canonical 7 fields (url, name, snippet, host_name, rank, date, favicon) plus three optional extension fields: source_engine, raw_html, score. Consumer code written against any similarly-shaped web-search skill works here without changes.
cheerio (HTML parser) and Node's built-in fetch.engine: "auto" (default).--locale en-US / zh-CN / ja-JP / any BCP-47 tag. Critical for non-US IPs where Bing otherwise serves localized results even for English queries.--recency-days 7 restricts to past-week results. DDG supports exact N days; Bing/Google use day/week/month buckets.url / name / snippet / host_name / rank / date / favicon fields, identical in shape to other web-search skills in the ClawHub registry.tsx bin/web-search.ts <query> for one-offs, or import { search } from "local-search" in code.tsx, or zero-config via bun.One-line install (ClawHub CLI):
npx clawhub install @Sakurakilove/local-search
Manual setup (clone + npm install):
# 1. Install (one runtime dep: cheerio)
cd skills/local-search && npm install
# 2. Search (auto-fallback across DDG → Bing; Google excluded from auto chain)
tsx bin/web-search.ts "artificial intelligence"
# 3. Pin a specific engine + locale
tsx bin/web-search.ts "machine learning" --engine bing --locale en-US --num 5
# 4. Recent news as JSON
tsx bin/web-search.ts "AI breakthroughs" --recency-days 7 --json -o ai_news.json
Programmatic use:
import { search } from "local-search";
const outcome = await search("What is the capital of France?", { num: 5 });
if (outcome.success) {
console.log(`Answered by ${outcome.engine} in ${outcome.elapsedMs}ms`);
outcome.results.forEach(r => console.log(`- ${r.name}\n ${r.url}`));
}
Use local-search whenever the user needs information that lives on the public web — beyond what's in the model's training data:
| Engine | Endpoint | API key | Residential IP | Datacenter IP | Recency |
|---|---|---|---|---|---|
| DuckDuckGo | html.duckduckgo.com/html/ (GET) | none | ✅ high | ⚠️ rate-limits under load | df=d<N> (exact days) |
| Bing | www.bing.com/search | none | ✅ high | ✅ high | freshness=d1|w1|m1 (bucketed) |
www.google.com/search | none | ⚠️ medium | ❌ low (enablejs wall) | tbs=qdr:d|w|m|y (bucketed) |
engine: "auto" tries them in order DuckDuckGo → Bing → Google, returning the first non-empty result set. From a datacenter IP the effective chain is DDG → Bing (Google is usually blocked); from a residential IP all three are viable. Edit AUTO_ENGINE_ORDER in src/engines/index.ts to change priority.
Recommended Location: {project_path}/skills/local-search
Extract this skill package to the above path in your project.
fetch).cheerio (HTML parser).Install once in the skill directory:
cd {project_path}/skills/local-search
npm install # or: bun install / pnpm install
The skill is pure TypeScript ESM. You can run it via
tsx(recommended, dev-friendly) orbun(zero-config). A runtime-agnosticnode --experimental-strip-types bin/web-search.tsalso works on Node 22+.
| Aspect | Cloud-search skills | This skill (local-search) |
|---|---|---|
| Search backend | Cloud function / API | Direct HTTP to DuckDuckGo / Bing / Google |
| API key / SDK | Required | None |
| Network path | Client → cloud → search engine | Client → search engine |
| Failure handling | SDK-dependent | Auto-fallback across 3 engines |
| Result schema | SearchFunctionResultItem (7 fields) | Same 7 fields + 3 optional extension fields |
| CLI | SDK-specific | tsx bin/web-search.ts |
local-search/
├── SKILL.md ← you are here
├── LICENSE.txt
├── package.json ← declares the `cheerio` dep
├── tsconfig.json
├── bin/
│ └── web-search.ts ← CLI entry
├── src/
│ ├── index.ts ← public SDK exports
│ ├── search.ts ← orchestrator with auto-fallback
│ ├── types.ts ← SearchFunctionResultItem + options
│ └── engines/
│ ├── _shared.ts ← fetch / parse helpers
│ ├── duckduckgo.ts ← POST https://html.duckduckgo.com/html/
│ ├── bing.ts ← GET https://www.bing.com/search
│ ├── google.ts ← GET https://www.google.com/search
│ └── index.ts ← engine registry + AUTO_ENGINE_ORDER
└── scripts/
└── web_search.ts ← quick-start example
The only runtime dependency is cheerio (HTML parser), declared in package.json. Node 18+'s built-in fetch covers the network layer — no other SDK is imported anywhere in this package.
The CLI lives at bin/web-search.ts. Run it via tsx, bun, or any TS-aware runner.
# Default: auto engine, 10 results, human-readable output
tsx bin/web-search.ts "artificial intelligence"
# Limit number of results
tsx bin/web-search.ts "machine learning" --num 5
# (short option works too)
tsx bin/web-search.ts "machine learning" -n 5
# Force DuckDuckGo only
tsx bin/web-search.ts "latest tech news" --engine duckduckgo
# Force Bing only
tsx bin/web-search.ts "latest tech news" --engine bing
# Force Google only (highest quality, most likely to be rate-limited)
tsx bin/web-search.ts "latest tech news" --engine google
# Default — try DDG → Bing → Google, return first non-empty
tsx bin/web-search.ts "latest tech news" --engine auto
# Results from last 7 days
tsx bin/web-search.ts "cryptocurrency news" --num 10 --recency-days 7
# (short option)
tsx bin/web-search.ts "cryptocurrency news" -r 7
# Write JSON to a file (human-readable banner is suppressed)
tsx bin/web-search.ts "climate change research" --num 5 --json -o search_results.json
# Or pipe JSON to stdout
tsx bin/web-search.ts "AI breakthroughs" --num 3 --recency-days 1 --json > ai_news.json
# Suppress the "engine / timing" banner — just print the results
tsx bin/web-search.ts "react hooks" --quiet
| Flag | Short | Description |
|---|---|---|
--num <N> | -n | Number of results (default: 10, max: 50) |
--engine <id> | -e | duckduckgo | bing | google | auto (default: auto) |
--recency-days <N> | -r | Restrict to results from last N days (default: 0 = no filter) |
--timeout <ms> | — | Per-engine timeout in ms (default: 8000) |
--json | — | Emit JSON instead of human-readable output |
--output <path> | -o | Write JSON to file instead of stdout |
--pretty | — | Pretty-print JSON (default on; pass --no-pretty to disable — not yet supported, omit --json instead) |
--quiet | -q | Suppress engine-info banner |
--help | -h | Show help |
Each result item is a SearchFunctionResultItem:
interface SearchFunctionResultItem {
// ----- Canonical fields (shared shape across ClawHub web-search skills) -----
url: string; // Full URL of the result
name: string; // Title of the page
snippet: string; // Preview text / description
host_name: string; // Domain name (e.g. "en.wikipedia.org")
rank: number; // 1-indexed ranking within the engine's result set
date: string; // Publication / update date; "N/A" when unknown
favicon: string; // Favicon URL (Google S2)
// ----- Extension fields (new in local-search) -----
source_engine: "duckduckgo" | "bing" | "google";
raw_html?: string; // Original HTML of the snippet element (optional)
score?: number; // Heuristic relevance score [0, 100] (optional)
}
If you're writing TypeScript/JavaScript code (not running the CLI), import from src/index.ts:
import { search } from "local-search";
const outcome = await search("What is the capital of France?", { num: 5 });
if (outcome.success) {
console.log(`Engine used: ${outcome.engine}`);
console.log(`Tried in order: ${outcome.enginesTried.join(" → ")}`);
console.log(`Elapsed: ${outcome.elapsedMs}ms`);
for (const item of outcome.results) {
console.log(`- ${item.name} [${item.source_engine}]`);
console.log(` ${item.url}`);
console.log(` ${item.snippet}`);
}
} else {
console.error("Search failed:", outcome.error);
if (outcome.errors) {
// Auto mode: see which engines tried and what they reported
for (const e of outcome.errors) {
console.error(` ${e.engine}:`, e.error);
}
}
}
import { searchOrThrow, AllEnginesFailedError } from "local-search";
try {
const results = await searchOrThrow("JavaScript frameworks", { num: 10 });
console.log(results);
} catch (err) {
if (err instanceof AllEnginesFailedError) {
console.error("All engines failed:", err.errors);
} else {
console.error("Single-engine error:", err);
}
}
import { search } from "local-search";
// Skip the fallback chain — only use DuckDuckGo.
const outcome = await search("quantum computing applications", {
num: 8,
engine: "duckduckgo",
});
const outcome = await search("genomics research", {
num: 5,
recency_days: 30, // past 30 days
});
const outcome = await search("hello world", {
fetchImpl: myMockFetch, // any WhatWG-fetch-compatible function
userAgent: "my-bot/1.0",
timeoutMs: 5000,
});
duckduckgo)https://html.duckduckgo.com/html/ (POST form q=...)df=d<N> form field.bing)https://www.bing.com/search?q=...&count=...freshness=d1 / w1 / m1 bucket (Bing has no arbitrary-N-days URL filter).ensearch=1).google)https://www.google.com/search?q=...&num=.../httpservice/retry/enablejs) with zero parseable results — the engine surfaces this as a soft failure and the auto chain falls through. From residential IPs, Google HTML scraping often works fine.tbs=qdr:d|w|m|y bucket.When engine: "auto" (the default), the orchestrator tries engines in this order:
It returns the first engine that yields ≥ 1 result. If all three fail, the outcome is { success: false, errors: [...] } with every engine's error attached — or, in searchOrThrow mode, an AllEnginesFailedError.
Practical reliability table (observed during testing):
| Caller IP type | DDG | Bing | Effective auto chain | |
|---|---|---|---|---|
| Residential | high | high | medium | DDG → Bing → Google (3 viable) |
| Datacenter / cloud | medium (rate-limits under load) | high | low (enablejs wall) | effectively DDG → Bing |
The order is defined in src/engines/index.ts (AUTO_ENGINE_ORDER); edit that array if you want a different priority.
Issue: Required dependency 'cheerio' is not installed
cd skills/local-search && npm install cheerioIssue: All search engines failed (auto mode)
errors array lists each engine's failure. Common causes:
Issue: DuckDuckGo returns "No results parsed"
auto mode can fall through to Bing.--engine bing or wait.Issue: Google returns "consent / captcha page"
Issue: Results from a single engine look stale / inconsistent
auto chain returns whichever engine answered first — not a merged set.results from multiple search() calls (one per engine).Issue: Cannot find module 'local-search' when importing as SDK
package.json workspaces, or import via a relative path: import { search } from "./skills/local-search/src/index.js".search() call is stateless, but process startup (TS compile via tsx) costs ~200ms. For batch jobs, write a small Node script that loops over queries instead of spawning the CLI per query.num: Each engine over-fetches then truncates; smaller num means slightly less parsing work.Promise.all([...search(q1), search(q2), search(q3)]) — they hit different engines / endpoints concurrently.result.url before redirecting end users (the skill returns the engine's raw URL; some engines occasionally surface tracking redirects).userAgent if your use case requires honesty.A minimal example lives at scripts/web_search.ts:
# Default query
tsx scripts/web_search.ts
# Custom query + num
tsx scripts/web_search.ts "your query" 5
It's intentionally tiny — a smoke-test you can eyeball, not a feature showcase.
cheerio and Node's built-in fetch.url / name / snippet / host_name / rank / date / favicon fields, plus three optional extension fields (source_engine, raw_html, score).engine: "auto" (the default).tsx bin/web-search.ts <query> [--num N] [--engine auto|duckduckgo|bing|google] [--recency-days N] [--json] [-o file.json].import { search, searchOrThrow } from "local-search".cd skills/local-search && npm install.tsx scripts/web_search.ts.The canonical SearchFunctionResultItem shape was inspired by z-ai-web-dev-sdk's MIT-licensed web-search skill. All engine backend code in this package is original.