Safe Smart Web Fetch

v1.0.0

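A safe web-fetching skill. Before fetching page content, it checks by default whether the URL may contain a token, whether it points to an intranet or local hostname, and whether it is a private link. URLs in any of these three categories never go through third-party cleaning services and are only fetched directly. All other public pages are tried in order through Jina Reader, markdown.new, and defuddle.md to obtain clean Markdown, falling back to a raw fetch if those fail.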

by Qihong (@zqh2333)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
Capability signals
Requires OAuth token
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description, SKILL.md, and the included Python script all implement the same functionality: classify URLs as private/sensitive or public, and for public pages attempt third‑party 'cleaners' before falling back to raw fetch. No unrelated credentials, binaries, or config paths are requested.
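The ordering described above (cleaners first for public pages, direct fetch only for sensitive ones) can be sketched roughly as follows. This is an illustrative sketch, not the skill's actual script: the function names are hypothetical, the prefix-style cleaner URLs are an assumption that should be checked against each service's documentation, and the `fetch` callable is injected so the logic can be exercised without network access.

```python
from typing import Callable, List, Tuple

# Assumed prefix-style endpoints (cleaner URL + target URL); verify the exact
# format each service expects before relying on this.
CLEANER_PREFIXES = (
    "https://r.jina.ai/",
    "https://markdown.new/",
    "https://defuddle.md/",
)


def plan_attempts(url: str, is_sensitive: bool) -> List[str]:
    """Return the URLs to try, in order. Sensitive URLs get a direct fetch only."""
    if is_sensitive:
        return [url]
    return [prefix + url for prefix in CLEANER_PREFIXES] + [url]


def fetch_with_fallback(
    url: str, is_sensitive: bool, fetch: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each planned URL and return (attempt_url, body) for the first success.

    `fetch` is a hypothetical callable that returns the response text or raises
    OSError on failure (urllib.error.URLError is a subclass of OSError).
    """
    last_error = None
    for attempt in plan_attempts(url, is_sensitive):
        try:
            body = fetch(attempt)
        except OSError as exc:
            last_error = exc
            continue
        if body.strip():  # treat an empty or whitespace-only body as a failure
            return attempt, body
    raise RuntimeError(f"all attempts failed for {url}") from last_error
```

Keeping the plan as a plain list makes it easy to log or audit exactly which hosts would see a given URL before any request is made.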
Instruction Scope
Instructions and script only perform URL classification and HTTP(S) fetches, and call the listed third‑party cleaners (r.jina.ai, markdown.new, defuddle.md) for public pages. They do not read local files or environment variables. However, URL classification is heuristic; misclassification (false negatives in sensitive detection) could cause an otherwise-sensitive URL to be sent to external services, which is a privacy risk. The SKILL.md states the intended protections and the code implements them, but these are heuristic protections, not provably complete.
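To make the heuristic nature of the classification concrete, here is a minimal sketch of what such a check might look like using only `urllib.parse` and `ipaddress`. The hint lists and the function name are hypothetical and are not taken from the skill's script; the last example in the usage note shows the kind of false negative the scan warns about.

```python
import ipaddress
from urllib.parse import parse_qsl, urlsplit

# Hypothetical hint lists for illustration only; the skill's real lists may differ.
TOKEN_PARAM_HINTS = ("token", "key", "secret", "sig", "auth", "password", "session")
PRIVATE_HOST_SUFFIXES = (".local", ".internal", ".lan", ".localhost")


def is_sensitive_url(url: str) -> bool:
    """Heuristically decide whether a URL should be kept away from third parties."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Credentials embedded in the URL (user:pass@host).
    if parts.username or parts.password:
        return True
    # Missing, local, or internal-looking hostnames.
    if not host or host == "localhost" or host.endswith(PRIVATE_HOST_SUFFIXES):
        return True
    # Single-label hosts such as "intranet" (no dot, and not an IPv6 literal).
    if "." not in host and ":" not in host:
        return True
    # Literal IP addresses in private, loopback, link-local, or reserved ranges.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal
    else:
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return True
    # Query parameter names that look like secrets (substring match, so this
    # deliberately over-matches, e.g. "keyword" is flagged because of "key").
    for name, _value in parse_qsl(parts.query, keep_blank_values=True):
        if any(hint in name.lower() for hint in TOKEN_PARAM_HINTS):
            return True
    return False
```

With this sketch, `http://192.168.1.10/` and `https://example.com/a?access_token=abc` are classified as sensitive, while `https://example.com/invite/9f8a7b` is not, even though the path segment may well be a secret invite code. A check of this kind cannot see tokens embedded in paths or hostnames that resolve to private addresses.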
Install Mechanism
No install spec; it's an instruction/script-only skill that runs with standard Python stdlib modules (urllib, ssl, ipaddress). Nothing is downloaded or written to disk at install time.
Credentials
The skill requests no environment variables, credentials, or config paths. All network calls are to the target URL or to the explicit third‑party cleaner endpoints documented in the SKILL.md; these calls are coherent with the stated purpose.
Persistence & Privilege
Skill does not request permanent/always inclusion, does not alter global OpenClaw configuration, and contains no mechanism to persist new credentials or modify other skills. Default autonomous invocation is allowed but not combined with other concerning privileges.
Assessment
This skill appears coherent and does what it says: it classifies URLs and, for public pages, tries third-party cleaners before falling back to the original fetch. Before installing or running it against sensitive content, consider:
1) Review and, if necessary, replace or restrict the third-party endpoints (r.jina.ai, markdown.new, defuddle.md) to services you trust or to an internal sanitizer.
2) Test the classifier against representative internal/private URLs to confirm it blocks them from third-party calls (heuristics may miss uncommon token names or edge hostnames).
3) If you need stronger guarantees, add explicit allow/deny lists or require explicit user confirmation before sending any URL containing query parameters or single-label hosts to external services.
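One possible shape for the stronger guarantees suggested in the assessment is a gate that runs before any third-party call. The sketch below is an assumption-laden illustration: the host lists, the `may_use_cleaner` name, and the `confirm` callback are all hypothetical and not part of the skill.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical lists; replace with hosts appropriate for your environment.
ALLOWED_HOSTS = {"example.com", "docs.example.com"}
DENIED_HOSTS = {"wiki.corp.example"}


def _matches(host: str, names: set) -> bool:
    """True if host equals a listed name or is a subdomain of one."""
    return any(host == name or host.endswith("." + name) for name in names)


def may_use_cleaner(url: str, confirm: Optional[Callable[[str], bool]] = None) -> bool:
    """Decide whether a URL may be sent to a third-party cleaner.

    `confirm` is a hypothetical callback that asks the user and returns True
    only on explicit approval; without it, risky URLs are refused.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # 1. The deny list always wins.
    if not host or _matches(host, DENIED_HOSTS):
        return False
    # 2. Query parameters or single-label hosts require explicit confirmation.
    needs_confirmation = bool(parts.query) or "." not in host
    if needs_confirmation and not (confirm is not None and confirm(url)):
        return False
    # 3. Even when confirmed, only allow-listed hosts go to external services.
    return _matches(host, ALLOWED_HOSTS)
```

Confirmation here narrows what is allowed instead of overriding the allow list, so a confirmed single-label host such as `http://intranet/` is still refused.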

Like a lobster shell, security has layers — review code before you run it.

latest: vk97196asb0xkw8h2zy86rrfeax84cekv
