DeepReader
v0.1.0
The default web content reader for OpenClaw. Reads X (Twitter), Reddit, YouTube, and any webpage into clean Markdown. Zero API keys required.
Security Scan (OpenClaw)
Verdict: Suspicious, medium confidence

Purpose & Capability
The code and manifest match the described purpose: parsers for X/Twitter (FxTwitter + Nitter), Reddit (.json), YouTube transcripts, and generic webpages using trafilatura/BeautifulSoup. However, SKILL.md and other text contain typos/inconsistent names (e.g., "DEEPREEDER" / "DeepReeder") and the repo includes Python modules despite an earlier statement that the skill is instruction-only. The presence of a requirements.txt but no install spec is an implementation mismatch.
Instruction Scope
The skill triggers on any message containing 'http(s)://' and will attempt to fetch every detected URL (GenericParser will fetch arbitrary domains). There is no domain allowlist, no internal-host blocking, and no explicit SSRF protections. It writes the fetched content into agent memory. This broad, automatic URL-fetching behavior is the primary security concern (SSRF/data exposure, untrusted fetches).
Install Mechanism
There is no install spec (instruction-only in metadata), yet the package contains Python code that imports external libraries (trafilatura, bs4, requests, youtube_transcript_api). Without an install step the runtime may lack required dependencies, causing failures; the lack of an installation mechanism is an operational inconsistency but not itself malicious.
Credentials
The skill does not request credentials or secrets (requires.env empty), which is appropriate. SKILL.md documents two environment variables (DEEPREEDER_MEMORY_PATH, DEEPREEDER_LOG_LEVEL) but the code does not read these explicitly and the variable name is misspelled relative to the skill name — an inconsistent configuration story that could confuse administrators.
Persistence & Privilege
The skill saves fetched content to a memory directory (default ../../memory/inbox/). It is not forced-always, but it is user-invocable and the manifest declares a message trigger that causes automatic invocation when messages contain URLs. Autonomous invocation combined with unrestricted fetching and writing to agent memory increases blast radius (SSRF, local data accumulation).
What to consider before installing
This skill appears to implement a real web content reader, but exercise caution before enabling it broadly. Key points to consider:
- SSRF / unrestricted fetches: The skill will attempt to download any URL it detects (generic fallback fetches arbitrary hosts). If the agent runs in a networked environment with access to internal resources (localhost, internal metadata endpoints, cloud IMDS, private services), maliciously crafted messages or links could cause the agent to connect to those endpoints. Restricting the skill to isolated execution environments or adding a URL allowlist/blocklist is recommended.
- Automatic triggering: The manifest triggers on any message containing "http(s)://". If you want manual control, disable the automatic trigger or require explicit user invocation.
- Storage: Fetched content is written to the agent's memory directory (default ../../memory/inbox/). Confirm that storing external content there is acceptable and that sensitive data won't be leaked to downstream components that read agent memory.
- Dependencies & deployment: The package imports non-stdlib libraries (trafilatura, bs4, youtube_transcript_api, requests). There is no install spec — ensure required dependencies are installed in a controlled way before use.
- Minor red flags: Several typos/inconsistencies ("DeepReeder"/"DEEPREEDER") and mismatches between SKILL.md and code suggest the package may be lightly maintained — review code before trusting in production.
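The SSRF concern above can be mitigated with a pre-fetch guard that refuses URLs resolving to internal hosts. A minimal sketch; the function name and checks are illustrative, not part of the skill:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_url_safe(url: str) -> bool:
    """Reject URLs that resolve to private, loopback, link-local, or reserved hosts."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        # Resolve every address the hostname maps to (A and AAAA records).
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        # Blocks RFC 1918 ranges, 127.0.0.0/8, and link-local
        # (which includes the cloud metadata endpoint 169.254.169.254).
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
            return False
    return True
```

Note that a guard like this should also run at fetch time, since DNS answers can change between the check and the request (DNS rebinding).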
If you plan to use it: run the skill in a sandboxed environment with constrained network egress, review and limit which domains are fetchable, audit requirements.txt and install dependencies from trusted sources, and consider disabling automatic URL-triggering until you add domain/host protections.
DeepReader
The default web content reader for OpenClaw agents. Automatically detects URLs in messages, fetches content using specialized parsers, and saves clean Markdown with YAML frontmatter to agent memory.
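URL detection of this kind can be as simple as a regex scan over the incoming message. A hypothetical sketch; the actual pattern the skill uses is not shown in this README:

```python
import re

# Matches http(s) URLs up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")

def extract_urls(message: str) -> list[str]:
    """Return every http(s) URL found in a message, in order of appearance."""
    return URL_PATTERN.findall(message)
```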
Use when
- A user shares a tweet, thread, or X article and you need to read its content
- A user shares a Reddit post and you need the discussion + top comments
- A user shares a YouTube video and you need the transcript
- A user shares any blog, article, or documentation URL and you need the text
- You need to batch-read multiple URLs from a single message
Supported sources
| Source | Method | API Key? |
|---|---|---|
| Twitter / X | FxTwitter API + Nitter fallback | None |
| Reddit | .json suffix API | None |
| YouTube | youtube-transcript-api | None |
| Any URL | Trafilatura + BeautifulSoup | None |
Usage
```python
from deepreader_skill import run

# Automatic — triggered when message contains URLs
result = run("Check this out: https://x.com/user/status/123456")

# Reddit post with comments
result = run("https://www.reddit.com/r/python/comments/abc123/my_post/")

# YouTube transcript
result = run("https://youtube.com/watch?v=dQw4w9WgXcQ")

# Any webpage
result = run("https://example.com/blog/interesting-article")

# Multiple URLs at once
result = run("""
https://x.com/user/status/123456
https://www.reddit.com/r/MachineLearning/comments/xyz789/
https://example.com/article
""")
```
Output
Content is saved as .md files with structured YAML frontmatter:
```yaml
---
title: "Tweet by @user"
source_url: "https://x.com/user/status/123456"
domain: "x.com"
parser: "twitter"
ingested_at: "2026-02-16T12:00:00Z"
content_hash: "sha256:..."
word_count: 350
---
```
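Producing a document in this shape is straightforward. A hedged sketch, with field names taken from the example above; the writer function itself is illustrative, not the skill's actual code:

```python
import hashlib
from datetime import datetime, timezone

def to_markdown_doc(title: str, source_url: str, domain: str, parser: str, body: str) -> str:
    """Render fetched content as Markdown with YAML frontmatter like the example above."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    frontmatter = "\n".join([
        "---",
        f'title: "{title}"',
        f'source_url: "{source_url}"',
        f'domain: "{domain}"',
        f'parser: "{parser}"',
        f'ingested_at: "{datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")}"',
        f'content_hash: "sha256:{digest}"',
        f"word_count: {len(body.split())}",
        "---",
    ])
    return frontmatter + "\n\n" + body
```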
Configuration
| Variable | Default | Description |
|---|---|---|
| `DEEPREEDER_MEMORY_PATH` | `../../memory/inbox/` | Where to save ingested content |
| `DEEPREEDER_LOG_LEVEL` | `INFO` | Logging verbosity |
How it works
```text
URL detected → is Twitter/X? → FxTwitter API → Nitter fallback
             → is Reddit?    → .json suffix API
             → is YouTube?   → youtube-transcript-api
             → otherwise     → Trafilatura (generic)
```
Triggers automatically when any message contains https:// or http://.
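The routing above amounts to a host-based dispatch. A minimal sketch under that assumption; the parser names are illustrative placeholders, not the skill's internal identifiers:

```python
from urllib.parse import urlparse

def pick_parser(url: str) -> str:
    """Route a URL to a parser name following the decision chain above."""
    host = (urlparse(url).hostname or "").lower().removeprefix("www.")
    if host in ("twitter.com", "x.com"):
        return "twitter"   # FxTwitter API, Nitter fallback
    if host.endswith("reddit.com"):
        return "reddit"    # append .json to the post URL
    if host in ("youtube.com", "youtu.be"):
        return "youtube"   # youtube-transcript-api
    return "generic"       # Trafilatura + BeautifulSoup fallback
```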
