Site Summarizer

v4.1.0

URL fetcher with summarization. Fetches URLs, extracts content, generates summaries. Optional caching with configurable directory and TTL. Use for web conten...

by CJ Hauser (@cloudcompile)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for cloudcompile/site-summarizer.

Prompt preview (Install & Setup):
Install the skill "Site Summarizer" (cloudcompile/site-summarizer) from ClawHub.
Skill page: https://clawhub.ai/cloudcompile/site-summarizer
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install site-summarizer

ClawHub CLI

npx clawhub@latest install site-summarizer

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name/description (URL fetcher + summarizer) align with the provided script and SKILL.md. Requested env vars (cache dir, TTL, hide IP) are appropriate for caching/privacy features. No unrelated binaries or credentials are requested.
Instruction Scope
SKILL.md instructs running the included Python script and documents the env vars used. The script's actions (DNS resolution, TCP/SSL GET, HTML extraction, summarization, optional caching) are exactly what the description promises. It only reads/writes cache files in its own directory and does not instruct the agent to read arbitrary unrelated files or secrets.
Install Mechanism
There is no install spec (instruction-only with an included Python file). Nothing is downloaded from external URLs or installed on the system beyond writing its own cache files, so install risk is low.
Credentials
No credentials or secrets are requested. Three environment variables control cache dir, TTL, and IP-hiding behavior — all justified by the skill's caching/privacy features.
Persistence & Privilege
The skill does not request always: true and does not change other skills or system-wide configuration. It stores cache files under a user-scoped directory (default ~/.cache/site-summarizer), which is proportionate to its function.
Assessment
This skill appears to do what it says: it connects to arbitrary HTTP/HTTPS URLs, extracts and summarizes page content, and optionally caches results under ~/.cache/site-summarizer (or a custom directory via SITE_SUMMARIZER_CACHE_DIR). It does not ask for API keys or post data to third-party endpoints. Before installing, consider:

  • It performs network requests, so only use it where making outbound connections is acceptable.
  • It writes cache files to your home directory; inspect or configure SITE_SUMMARIZER_CACHE_DIR if that is a concern.
  • It attempts to block private/cloud metadata IPs, but no blocking is perfectly foolproof; avoid using it in environments where querying internal services would be risky.

If you need stronger isolation, run the script in a sandboxed environment or container.
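
The private and cloud-metadata IP blocking mentioned in the assessment is not shown on this page. As a rough illustration of what such a check can look like, here is a minimal sketch using Python's standard ipaddress module. The function name is_blocked_address and the exact set of blocked ranges are assumptions for illustration, not the skill's actual logic.

```python
import ipaddress


def is_blocked_address(ip_text: str) -> bool:
    """Return True if the address should not be fetched (hypothetical policy).

    Assumption: private, loopback, link-local (which covers the
    169.254.169.254 metadata endpoint), reserved, multicast, and
    unspecified addresses are all treated as off limits.
    """
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so the IPv4 rules apply.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )
```

A check like this only helps if it is applied to the address the connection actually uses; a hostname that resolves differently between the check and the request can still slip through, which is one reason the assessment recommends sandboxing.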

Latest: vk97d4h3gzfdkgcm3nd5w6qtrgh847e6s
Downloads: 111
Stars: 1
Versions: 8
Updated: 3w ago
Version: v4.1.0
License: MIT-0

Site Summarizer v4.1.0

Features

  • Content extraction from HTML pages
  • Automatic summarization
  • Keyword extraction and language detection
  • Optional caching with env vars

Output

{
  "success": true,
  "content": "...",
  "summary": "...",
  "metadata": {"title": "...", "description": "...", "author": "..."},
  "analysis": {"language": "en", "keywords": [...], "word_count": N, "read_time_min": N},
  "status": 200,
  "from_cache": false
}
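
A caller will usually want only a few of these fields. The sketch below shows one way to turn the JSON above into a one-line report. The helper name describe_result is hypothetical, and the shape of a failed result (success set to false, with a status code) is an assumption, since only the success case is documented here.

```python
import json


def describe_result(raw: str) -> str:
    """Summarize one JSON result in a single line (hypothetical helper)."""
    result = json.loads(raw)
    if not result.get("success"):
        # Assumption: failures still carry a "status" field.
        return f"fetch failed (status {result.get('status')})"
    analysis = result.get("analysis", {})
    title = result.get("metadata", {}).get("title") or "(untitled)"
    origin = "cache" if result.get("from_cache") else "network"
    return (
        f"{title}: {analysis.get('word_count')} words, "
        f"~{analysis.get('read_time_min')} min read, from {origin}"
    )
```

Using .get throughout keeps the helper tolerant of missing keys, which seems prudent when the full output schema is not specified.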

Environment Variables

  • SITE_SUMMARIZER_CACHE_DIR - Cache directory (default: ~/.cache/site-summarizer)
  • SITE_SUMMARIZER_CACHE_TTL - Cache TTL in seconds (default: 3600)
  • SITE_SUMMARIZER_HIDE_IP - Set to "true" to hide resolved IP in output
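
To make the defaults concrete, here is a minimal sketch of how a script might read these three variables and decide whether a cached file is still usable. The names load_settings and is_fresh are hypothetical, and both the case-insensitive "true" comparison and the use of file modification time for the TTL are assumptions; the skill's own handling may differ.

```python
import os
import time
from pathlib import Path


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    cache_dir = Path(
        env.get("SITE_SUMMARIZER_CACHE_DIR", "~/.cache/site-summarizer")
    ).expanduser()
    ttl_seconds = int(env.get("SITE_SUMMARIZER_CACHE_TTL", "3600"))
    # Assumption: the comparison ignores case, so "True" also enables hiding.
    hide_ip = env.get("SITE_SUMMARIZER_HIDE_IP", "").lower() == "true"
    return cache_dir, ttl_seconds, hide_ip


def is_fresh(path: Path, ttl_seconds: int, now=None) -> bool:
    """Treat an entry as fresh if it was modified less than ttl_seconds ago."""
    now = time.time() if now is None else now
    return path.exists() and (now - path.stat().st_mtime) < ttl_seconds
```

Passing a plain dict as env makes the settings easy to exercise without touching the real environment.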

Usage

python fetch_and_summarize.py <url>
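
The same command can be driven from another Python program. The sketch below assumes the script prints its JSON result to standard output and that it sits in the current directory; neither is confirmed on this page. The helper names build_command and run_summarizer are hypothetical.

```python
import json
import subprocess
import sys


def build_command(url: str, script: str = "fetch_and_summarize.py") -> list:
    """Build the argument list for the documented usage line."""
    return [sys.executable, script, url]


def run_summarizer(url: str, script: str = "fetch_and_summarize.py",
                   timeout: float = 60.0) -> dict:
    """Run the script and parse stdout as JSON (assumed output channel)."""
    completed = subprocess.run(
        build_command(url, script),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # let the caller inspect "success" instead of raising
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell interpretation of the URL, which matters when URLs contain characters such as & or ?.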

v4.1.0 Fixes

  • Fixed redirect header parsing
  • Fixed regex patterns for redaction
  • Added optional IP hiding via env var
  • Code cleanup and bug fixes
