XPR Web Scraping
Tools for fetching and extracting cleaned text, metadata, and links from single or multiple web pages with format options and link filtering.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
⭐ 0 · 1.6k · 11 current installs · 11 all-time installs
by @paulgnz
Security Scan
OpenClaw
Benign (medium confidence)
Purpose & Capability
Name/description (fetching, extracting text/links/metadata) match the actual tools and code: scrape_url, extract_links, scrape_multiple. No unrelated env vars, binaries, or services are requested.
Instruction Scope
SKILL.md describes limited scraping actions (single page, link extraction, multi-page up to 10). Instructions recommend rate-limiting and content-size limits and do not instruct access to unrelated files, credentials, or external endpoints beyond the target pages.
Install Mechanism
No install spec; skill is instruction-plus-code and relies on built-in Node fetch. No downloads, package registry installs, or archive extraction are present in the provided metadata.
Credentials
Skill requires no environment variables, credentials, or config paths. The code uses only network fetch and in-memory parsing; requested access is proportional to web-scraping functionality.
Persistence & Privilege
The always flag is false and disable-model-invocation is false (normal defaults). The skill does not request persistent system-wide privileges or modify other skills. Autonomous invocation is allowed by platform default, but this is not combined with other red flags.
Assessment
This skill appears to be a coherent, self-contained web scraper that doesn't request secrets or install external code. Before installing: (1) review the full src/index.ts (the provided snippet was truncated) to confirm there are no hidden network callbacks or logging endpoints; (2) ensure use complies with target sites' robots.txt, terms of service, and legal/privacy rules; (3) enforce rate limits and avoid scraping protected or paywalled content; (4) if you run in a sensitive environment, sandbox the skill (or review it for unexpected outbound endpoints) before enabling autonomous invocation.
Current version: v0.2.11
Tags: extraction, latest, web-scraping, xpr
SKILL.md
Web Scraping
You have web scraping tools for fetching and extracting data from web pages:
Single page:
- scrape_url: fetch a URL and get cleaned text content plus metadata (title, description, link count)
  - Use format="text" (default) for most tasks; strips all HTML
  - Use format="markdown" to preserve headings, links, lists, and bold/italic
  - Use format="html" only when you need raw HTML
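To make the three format options concrete, here is a minimal sketch of the kind of transformation each one implies. These helpers are illustrative assumptions, not the skill's actual code, which presumably does more careful HTML parsing:

```typescript
// format="text": strip every tag and collapse whitespace (illustrative only).
function htmlToText(html: string): string {
  return html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
}

// format="markdown": keep some structure (headings, bold) before stripping
// the remaining tags. A real converter handles many more elements.
function htmlToMarkdown(html: string): string {
  return html
    .replace(/<h1[^>]*>(.*?)<\/h1>/g, "# $1\n")
    .replace(/<strong[^>]*>(.*?)<\/strong>/g, "**$1**")
    .replace(/<[^>]+>/g, " ")
    .replace(/[ \t]+/g, " ")
    .trim();
}

const page = "<h1>Title</h1><p>Hello <strong>world</strong></p>";
console.log(htmlToText(page));     // flat text, no structure
console.log(htmlToMarkdown(page)); // heading and bold markers preserved
```

For most extraction tasks the flat text form is easier for a model to consume; the markdown form matters when document structure (headings, emphasis) carries meaning.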
Link discovery:
- extract_links: fetch a page and extract all links with text and type (internal/external)
  - Use the pattern parameter to filter by regex (e.g. "\\.pdf$" for PDF links)
  - Links are deduplicated and resolved to absolute URLs
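The dedup, absolute-URL resolution, and pattern filtering described above can be sketched with the standard WHATWG URL API. This is an assumed, simplified shape, not the skill's code; the real tool parses hrefs out of fetched HTML, while here they are passed in directly:

```typescript
// Resolve hrefs against a base URL, deduplicate, optionally filter by regex.
function extractLinks(hrefs: string[], baseUrl: string, pattern?: string): string[] {
  const re = pattern ? new RegExp(pattern) : undefined;
  const seen = new Set<string>();
  for (const href of hrefs) {
    try {
      const abs = new URL(href, baseUrl).toString(); // resolves relative hrefs
      if (!re || re.test(abs)) seen.add(abs);        // Set deduplicates
    } catch {
      // ignore malformed hrefs
    }
  }
  return [...seen];
}

// "/a.pdf" and the pattern "\\.pdf$" keep only the PDF link, deduplicated.
console.log(extractLinks(["/a.pdf", "b.html", "/a.pdf"], "https://example.com/", "\\.pdf$"));
```

Resolving through `new URL(href, base)` is what makes relative links ("/docs", "docs", "#top") comparable, so duplicates that only differ in how they were written collapse to one absolute URL.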
Multi-page research:
- scrape_multiple: fetch up to 10 URLs in parallel for comparison/research
  - One failure doesn't block the others (uses Promise.allSettled)
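The failure-isolation pattern scrape_multiple is described as using looks roughly like this. The fetcher is injected here so the sketch runs without network access; the real tool presumably calls Node's built-in fetch:

```typescript
type ScrapeResult = { url: string; ok: boolean; content?: string; error?: string };

// Fetch all URLs in parallel; Promise.allSettled keeps successful results
// even when some requests reject, so one failure doesn't block the rest.
async function scrapeMultiple(
  urls: string[],
  fetchOne: (url: string) => Promise<string>
): Promise<ScrapeResult[]> {
  const capped = urls.slice(0, 10); // documented cap of 10 URLs
  const settled = await Promise.allSettled(capped.map(fetchOne));
  return settled.map((r, i) =>
    r.status === "fulfilled"
      ? { url: capped[i], ok: true, content: r.value }
      : { url: capped[i], ok: false, error: String(r.reason) }
  );
}

// Demo with a fake fetcher: one URL succeeds, one fails.
const fake = (url: string) =>
  url.includes("bad") ? Promise.reject(new Error("timeout")) : Promise.resolve("<html>ok</html>");
scrapeMultiple(["https://a.test", "https://bad.test"], fake).then(r => console.log(r));
```

With Promise.all instead of Promise.allSettled, the single rejection would discard both results; allSettled is the right choice when partial results are still useful for research or comparison.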
Best practices:
- Prefer "text" format for content extraction, "markdown" for preserving structure
- Don't scrape the same domain more than 5 times per minute
- Combine with store_deliverable to save scraped content as job evidence
- For very large pages, content is limited to 5 MB
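The "5 requests per domain per minute" guideline above can be enforced with a small sliding-window limiter. The skill only recommends the limit and does not ship this helper, so treat it as a sketch for callers to adapt:

```typescript
// Per-domain sliding-window rate limiter: at most `max` hits per `windowMs`.
class DomainRateLimiter {
  private hits = new Map<string, number[]>();
  constructor(private max = 5, private windowMs = 60_000) {}

  // Returns true if a request to this URL's domain is currently allowed,
  // recording the hit; false means the caller should back off and retry later.
  allow(url: string, now: number = Date.now()): boolean {
    const domain = new URL(url).hostname;
    const recent = (this.hits.get(domain) ?? []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.max) {
      this.hits.set(domain, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(domain, recent);
    return true;
  }
}

const limiter = new DomainRateLimiter();
console.log(limiter.allow("https://example.com/page")); // true on the first hit
```

Keying on hostname means different paths on the same site share one budget, while a different domain gets its own, which matches how "same domain" is phrased in the guideline.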
Files: 3 total
