LiteBrowse

v0.1.1

Extracts and ranks the most relevant webpage passages for focused, low-token research, without loading or summarizing the full page.


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt below, then paste it into OpenClaw to install agitalent/litebrowse.

Prompt Preview: Install & Setup
Install the skill "LiteBrowse" (agitalent/litebrowse) from ClawHub.
Skill page: https://clawhub.ai/agitalent/litebrowse
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install litebrowse

ClawHub CLI


npx clawhub@latest install litebrowse
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Benign (high confidence)
Purpose & Capability
The name, description, SKILL.md, and included script all describe the same functionality: extracting and ranking relevant text blocks from a page. The skill requests no credentials, no unusual binaries, and no config paths — everything is proportionate to a web-extraction helper.
Instruction Scope
Instructions are specific: run the bundled Python extractor with a URL or local HTML file and use only the returned blocks. This scope matches the stated purpose. One operational note: the script will perform network fetches for any HTTP(S) URL provided, and will read local files if given — so when executed in an environment where the agent has network access it can reach arbitrary hosts (including internal endpoints). That behavior is expected for a fetcher but carries the usual network/SSRF risk depending on your runtime environment.
Install Mechanism
No install spec — instruction-only with one included Python script. The script uses only Python standard libraries (urllib, html.parser, etc.), so there is no package download or extraction risk.
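To illustrate what a stdlib-only approach looks like, here is a minimal sketch of a fetch-and-parse step in the same spirit. The class and function names are illustrative, not the bundled script's actual internals.

```python
# Hypothetical sketch of stdlib-only HTML text extraction, similar in
# spirit to the bundled script (names are illustrative, not its internals).
from html.parser import HTMLParser


class TextBlockParser(HTMLParser):
    """Collects visible text blocks, skipping script/style content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.blocks.append(text)


def extract_blocks(html: str) -> list[str]:
    """Return the visible text blocks of an HTML document."""
    parser = TextBlockParser()
    parser.feed(html)
    return parser.blocks
```

Because everything above is in the standard library, a reviewer can audit the whole extraction path without chasing third-party dependencies.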
Credentials
The skill declares no environment variables, no credentials, and no config paths. The script does read either a network URL or a local file path provided at runtime, which is appropriate for its purpose.
Persistence & Privilege
The "always" flag is false, and the skill does not request persistent or system-level privileges or modify other skills. Autonomous invocation is allowed by default but is not combined with broad credentials or suspicious behavior here.
Assessment
This skill appears coherent: it includes a readable Python extractor that fetches a page (or reads a local HTML file), parses it, and returns high-relevance blocks. Before installing or enabling it, consider: (1) review the script (already included) and confirm you trust it; (2) if you run agents in environments with access to internal services, be aware the extractor will fetch arbitrary URLs you or the agent provide — this can be used to access internal endpoints if network access isn't restricted; (3) if you prefer tighter control, run the extractor in a network-restricted sandbox or feed it local HTML snapshots instead of live URLs. If those considerations are acceptable, the skill is consistent with its stated purpose.
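The snapshot-instead-of-live-URL suggestion above can be sketched as a small helper: fetch a page once in a controlled step, save it locally, and point the extractor at the file so later runs need no network access. This is a hedged sketch under stated assumptions; the URL and output path are placeholders.

```python
# Sketch: snapshot a page to a local file once, then run the extractor
# against the file instead of a live URL. URL and path are placeholders.
from urllib.request import urlopen


def snapshot(url: str, path: str, timeout: float = 10.0) -> str:
    """Fetch url once and write its decoded body to path."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return path
```

After snapshotting, the extractor runs entirely offline: python3 ./scripts/web_relevance_extract.py "/tmp/page.html" "<query>"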

Like a lobster shell, security has layers — review code before you run it.

Tags: latest, search, token-efficient, web
133 downloads
0 stars
2 versions
Updated 1mo ago
v0.1.1
MIT-0

LiteBrowse Skill


Purpose

LiteBrowse is an OpenClaw skill for low-token webpage research.

Use it when:

  • the user wants facts from a specific webpage
  • the page is long or cluttered
  • token cost matters
  • you need the most relevant passages first instead of full-page dumps

Core Rule

Do not load or summarize the full page first.

Always run the local extractor before reasoning on webpage content:

python3 ./scripts/web_relevance_extract.py "<url-or-html-file>" "<query>"

The extractor returns only the most relevant blocks under a fixed character budget. Use that compact output as the default context for answering.
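The character-budget behavior described above can be illustrated with a short sketch (not the script's actual code): take blocks already sorted by relevance and greedily keep them until the budget is exhausted.

```python
# Illustrative sketch (not the bundled script's code) of packing
# relevance-ranked blocks under a fixed character budget.
def pack_blocks(ranked_blocks: list[str], max_chars: int) -> list[str]:
    """Keep blocks in rank order until adding one would exceed max_chars."""
    out, used = [], 0
    for block in ranked_blocks:
        if used + len(block) > max_chars:
            break
        out.append(block)
        used += len(block)
    return out
```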

Required Workflow

  1. Restate the information target as a short query string.
  2. Run:
    python3 ./scripts/web_relevance_extract.py "<source>" "<query>" --top-k 5 --max-chars 2400 --format json
    
  3. Read only the returned blocks.
  4. Answer from those blocks if they are sufficient.
  5. Only if recall is clearly insufficient, rerun with one controlled expansion:
    • increase --top-k
    • or increase --max-chars
    • or narrow / refine the query
  6. Do not jump to raw-page scraping unless the extractor failed.
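The workflow above can be sketched as a small wrapper. This is a hedged example: it assumes the extractor's --format json output parses as a single JSON document, which should be verified against real output before relying on it.

```python
# Sketch of the required workflow as a helper. The JSON output schema
# is an assumption; inspect real extractor output before relying on it.
import json
import subprocess

EXTRACTOR = "./scripts/web_relevance_extract.py"


def extractor_cmd(source, query, top_k=5, max_chars=2400):
    """Build the step-2 command line from the workflow above."""
    return [
        "python3", EXTRACTOR, source, query,
        "--top-k", str(top_k),
        "--max-chars", str(max_chars),
        "--format", "json",
    ]


def run_extractor(source, query, **kwargs):
    """Run the extractor and parse its JSON output."""
    result = subprocess.run(
        extractor_cmd(source, query, **kwargs),
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        # Step 6: surface the failure rather than fall back to raw scraping.
        raise RuntimeError(f"extractor failed: {result.stderr.strip()}")
    return json.loads(result.stdout)
```

Controlled expansion (step 5) then becomes a single retry with a larger top_k or max_chars, rather than an open-ended scrape.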

Budget Discipline

  • Prefer --max-chars 1200 to 2400 for narrow fact lookup.
  • Keep --top-k between 3 and 6 unless the user explicitly asks for breadth.
  • Narrow the query instead of widening the token budget when possible.
  • If the first run already contains the answer, stop there.

Output Discipline

When answering:

  • cite which returned block supports the answer
  • say when the extractor output is incomplete or ambiguous
  • distinguish extracted text from your inference
  • do not claim the full page was reviewed unless it actually was

Examples

Find pricing details from a long page:

python3 ./scripts/web_relevance_extract.py "https://example.com/pricing" "pricing tiers api limits enterprise" --max-chars 1600 --top-k 4 --format text

Find job requirements from a careers page:

python3 ./scripts/web_relevance_extract.py "https://example.com/jobs/ml-engineer" "requirements python llm retrieval location" --max-chars 1800 --top-k 5 --format json

Use a saved HTML file:

python3 ./scripts/web_relevance_extract.py "/tmp/page.html" "refund policy cancellation deadline" --max-chars 1200

Failure Handling

If the page cannot be fetched or parsed:

  • report the fetch or parse failure directly
  • ask for a local HTML copy if network access is blocked
  • do not fabricate an answer from URL guesses
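The failure-handling rules above amount to: surface the error verbatim, and fall back to asking for a local copy rather than to a guessed answer. A minimal sketch (the function name is hypothetical):

```python
# Hypothetical helper applying the failure rules: report the error as-is
# and suggest a local HTML copy only when the source was a network URL.
def describe_failure(source: str, error: Exception) -> str:
    msg = f"Could not fetch or parse {source}: {error}"
    if source.startswith(("http://", "https://")):
        msg += " If network access is blocked, provide a saved HTML copy instead."
    return msg
```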
