LiteBrowse

v0.1.1

Extracts and ranks the most relevant webpage passages for focused, low-token research, without loading or summarizing the full page.


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt below, then paste it into OpenClaw to install agitalent/litebrowse.

Prompt Preview: Install & Setup
Install the skill "LiteBrowse" (agitalent/litebrowse) from ClawHub.
Skill page: https://clawhub.ai/agitalent/litebrowse
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install litebrowse

ClawHub CLI


npx clawhub@latest install litebrowse
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Benign (high confidence)
Purpose & Capability
The name, description, SKILL.md, and included script all describe the same functionality: extracting and ranking relevant text blocks from a page. The skill requests no credentials, no unusual binaries, and no config paths — everything is proportionate to a web-extraction helper.
Instruction Scope
Instructions are specific: run the bundled Python extractor with a URL or local HTML file and use only the returned blocks. This scope matches the stated purpose. One operational note: the script will perform network fetches for any HTTP(S) URL provided, and will read local files if given — so when executed in an environment where the agent has network access it can reach arbitrary hosts (including internal endpoints). That behavior is expected for a fetcher but carries the usual network/SSRF risk depending on your runtime environment.
Install Mechanism
No install spec — instruction-only with one included Python script. The script uses only Python standard libraries (urllib, html.parser, etc.), so there is no package download or extraction risk.
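To illustrate what a stdlib-only approach looks like, here is a minimal sketch of a fetch-and-parse step in the same spirit. The class and function names are illustrative, not the bundled script's actual internals.

```python
# Hypothetical sketch of stdlib-only HTML text extraction, similar in
# spirit to the bundled script (names are illustrative, not its internals).
from html.parser import HTMLParser


class TextBlockParser(HTMLParser):
    """Collects visible text blocks, skipping script/style content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.blocks.append(text)


def extract_blocks(html: str) -> list[str]:
    """Return the visible text blocks of an HTML document."""
    parser = TextBlockParser()
    parser.feed(html)
    return parser.blocks
```

Because everything above is in the standard library, a reviewer can audit the whole extraction path without chasing third-party dependencies.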
Credentials
The skill declares no environment variables, no credentials, and no config paths. The script does read either a network URL or a local file path provided at runtime, which is appropriate for its purpose.
Persistence & Privilege
The "always" flag is false, and the skill does not request persistent or system-level privileges or modify other skills. Autonomous invocation is allowed by default but is not combined with broad credentials or suspicious behavior here.
Assessment
This skill appears coherent: it includes a readable Python extractor that fetches a page (or reads a local HTML file), parses it, and returns high-relevance blocks. Before installing or enabling it, consider: (1) review the script (already included) and confirm you trust it; (2) if you run agents in environments with access to internal services, be aware the extractor will fetch arbitrary URLs you or the agent provide — this can be used to access internal endpoints if network access isn't restricted; (3) if you prefer tighter control, run the extractor in a network-restricted sandbox or feed it local HTML snapshots instead of live URLs. If those considerations are acceptable, the skill is consistent with its stated purpose.
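The snapshot-instead-of-live-URL suggestion above can be sketched as a small helper: fetch a page once in a controlled step, save it locally, and point the extractor at the file so later runs need no network access. This is a hedged sketch under stated assumptions; the URL and output path are placeholders.

```python
# Sketch: snapshot a page to a local file once, then run the extractor
# against the file instead of a live URL. URL and path are placeholders.
from urllib.request import urlopen


def snapshot(url: str, path: str, timeout: float = 10.0) -> str:
    """Fetch url once and write its decoded body to path."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return path
```

After snapshotting, the extractor runs entirely offline: python3 ./scripts/web_relevance_extract.py "/tmp/page.html" "<query>"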

Like a lobster shell, security has layers — review code before you run it.

Tags: latest, search, token-efficient, web
133 downloads
0 stars
2 versions
Updated 1mo ago
v0.1.1
MIT-0

LiteBrowse Skill


Purpose

LiteBrowse is an OpenClaw skill for low-token webpage research.

Use it when:

  • the user wants facts from a specific webpage
  • the page is long or cluttered
  • token cost matters
  • you need the most relevant passages first instead of full-page dumps

Core Rule

Do not load or summarize the full page first.

Always run the local extractor before reasoning on webpage content:

python3 ./scripts/web_relevance_extract.py "<url-or-html-file>" "<query>"

The extractor returns only the most relevant blocks under a fixed character budget. Use that compact output as the default context for answering.
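The character-budget behavior described above can be illustrated with a short sketch (not the script's actual code): take blocks already sorted by relevance and greedily keep them until the budget is exhausted.

```python
# Illustrative sketch (not the bundled script's code) of packing
# relevance-ranked blocks under a fixed character budget.
def pack_blocks(ranked_blocks: list[str], max_chars: int) -> list[str]:
    """Keep blocks in rank order until adding one would exceed max_chars."""
    out, used = [], 0
    for block in ranked_blocks:
        if used + len(block) > max_chars:
            break
        out.append(block)
        used += len(block)
    return out
```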

Required Workflow

  1. Restate the information target as a short query string.
  2. Run:
    python3 ./scripts/web_relevance_extract.py "<source>" "<query>" --top-k 5 --max-chars 2400 --format json
    
  3. Read only the returned blocks.
  4. Answer from those blocks if they are sufficient.
  5. Only if recall is clearly insufficient, rerun with one controlled expansion:
    • increase --top-k
    • or increase --max-chars
    • or narrow / refine the query
  6. Do not jump to raw-page scraping unless the extractor failed.
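The workflow above can be sketched as a small wrapper. This is a hedged example: it assumes the extractor's --format json output parses as a single JSON document, which should be verified against real output before relying on it.

```python
# Sketch of the required workflow as a helper. The JSON output schema
# is an assumption; inspect real extractor output before relying on it.
import json
import subprocess

EXTRACTOR = "./scripts/web_relevance_extract.py"


def extractor_cmd(source, query, top_k=5, max_chars=2400):
    """Build the step-2 command line from the workflow above."""
    return [
        "python3", EXTRACTOR, source, query,
        "--top-k", str(top_k),
        "--max-chars", str(max_chars),
        "--format", "json",
    ]


def run_extractor(source, query, **kwargs):
    """Run the extractor and parse its JSON output."""
    result = subprocess.run(
        extractor_cmd(source, query, **kwargs),
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        # Step 6: surface the failure rather than fall back to raw scraping.
        raise RuntimeError(f"extractor failed: {result.stderr.strip()}")
    return json.loads(result.stdout)
```

Controlled expansion (step 5) then becomes a single retry with a larger top_k or max_chars, rather than an open-ended scrape.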

Budget Discipline

  • Prefer --max-chars 1200 to 2400 for narrow fact lookup.
  • Keep --top-k between 3 and 6 unless the user explicitly asks for breadth.
  • Narrow the query instead of widening the token budget when possible.
  • If the first run already contains the answer, stop there.

Output Discipline

When answering:

  • cite which returned block supports the answer
  • say when the extractor output is incomplete or ambiguous
  • distinguish extracted text from your inference
  • do not claim the full page was reviewed unless it actually was

Examples

Find pricing details from a long page:

python3 ./scripts/web_relevance_extract.py "https://example.com/pricing" "pricing tiers api limits enterprise" --max-chars 1600 --top-k 4 --format text

Find job requirements from a careers page:

python3 ./scripts/web_relevance_extract.py "https://example.com/jobs/ml-engineer" "requirements python llm retrieval location" --max-chars 1800 --top-k 5 --format json

Use a saved HTML file:

python3 ./scripts/web_relevance_extract.py "/tmp/page.html" "refund policy cancellation deadline" --max-chars 1200

Failure Handling

If the page cannot be fetched or parsed:

  • report the fetch or parse failure directly
  • ask for a local HTML copy if network access is blocked
  • do not fabricate an answer from URL guesses
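The failure-handling rules above amount to: surface the error verbatim, and fall back to asking for a local copy rather than to a guessed answer. A minimal sketch (the function name is hypothetical):

```python
# Hypothetical helper applying the failure rules: report the error as-is
# and suggest a local HTML copy only when the source was a network URL.
def describe_failure(source: str, error: Exception) -> str:
    msg = f"Could not fetch or parse {source}: {error}"
    if source.startswith(("http://", "https://")):
        msg += " If network access is blocked, provide a saved HTML copy instead."
    return msg
```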
