news_scraper
v1.0.0
This skill should be used when users need to scrape hot news topics from Chinese platforms (Weibo, Zhihu, Bilibili, Douyin, Toutiao, Tencent News, The Paper), generate summaries, and cite sources...
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Benign (high confidence)
Purpose & Capability
Name/description match the delivered artifacts: scripts implement API-based and direct scraping for the listed Chinese platforms and provide extractive/abstractive summarization. Required resources (requests, BeautifulSoup, optional transformers/jieba) are appropriate. Minor documentation mismatch: README suggests a pip package name and a project homepage URL (clawhub.com/workbuddy/...), but registry metadata shows no homepage — this may be an authoring/documentation inconsistency rather than a security concern.
Instruction Scope
SKILL.md instructs running the included Python scripts and references only the platforms and APIs relevant to the task. The runtime instructions and code perform HTTP GET requests to public endpoints (target sites and the aggregator uapis.cn) and read/write only local JSON/Markdown files; they do not attempt to read unrelated local files or environment secrets.
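To illustrate the "read/write only local JSON/Markdown files" behaviour described above, here is a minimal sketch of how scraped items might be saved with their sources cited. It is not the skill's actual code: the field names (`title`, `url`, `source`), the file names, and both function names are assumptions made for illustration.

```python
import json
from pathlib import Path


def render_markdown(items):
    """Render items as a numbered Markdown list with a source link per entry."""
    lines = ["# Hot topics", ""]
    for rank, item in enumerate(items, start=1):
        # Hypothetical fields: 'title', 'url', 'source'.
        lines.append(f"{rank}. [{item['title']}]({item['url']}) ({item['source']})")
    return "\n".join(lines) + "\n"


def save_outputs(items, out_dir):
    """Write the same items to a JSON file and a Markdown file in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / "hot_topics.json"  # hypothetical file name
    md_path = out / "hot_topics.md"      # hypothetical file name
    # ensure_ascii=False keeps Chinese titles readable in the JSON file.
    json_path.write_text(
        json.dumps(items, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    md_path.write_text(render_markdown(items), encoding="utf-8")
    return json_path, md_path
```

Keeping all writes under one caller-supplied directory makes it easy to confirm that nothing outside that directory is touched.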
Install Mechanism
No install spec is provided (instruction-only skill); code is included but no automated installers or external archive downloads are executed by the skill. Dependencies are user-managed via pip instructions in docs. Note: optional dependencies like transformers/torch will download large models from Hugging Face at runtime if used.
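Because the optional dependencies are user-managed, a caller may want to check for them before choosing a summarization mode. The sketch below is one way to do that with the standard library. The function names and the mode strings `"abstractive"` and `"extractive"` are assumptions, not the skill's actual interface.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


def pick_summary_mode(preferred, required=("transformers", "torch")):
    """Fall back to extractive mode when abstractive dependencies are absent."""
    if preferred != "abstractive":
        return preferred
    missing = missing_modules(required)
    if missing:
        print(f"Falling back to extractive mode; missing: {', '.join(missing)}")
        return "extractive"
    return "abstractive"
```

`find_spec` only locates a module without importing it, so the check itself does not trigger any model download.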
Credentials
The skill does not request environment variables, credentials, or config paths. All network calls go to a public aggregator API (uapis.cn) and the target websites; no hidden endpoints or secret exfiltration are present in the included code.
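A reader who wants to check the endpoint claim independently could list the hosts that appear as URL literals in a script. The following is a rough sketch using Python's `ast` module. The function name `hosts_in_source` is hypothetical, and the approach only sees URLs written as string literals, so it will miss any address assembled at runtime.

```python
import ast
from urllib.parse import urlparse


def hosts_in_source(source):
    """Return the sorted hostnames found in http(s) string literals of source."""
    hosts = set()
    for node in ast.walk(ast.parse(source)):
        # The literal parts of f-strings are also visited as ast.Constant nodes.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            value = node.value.strip()
            if value.startswith(("http://", "https://")):
                host = urlparse(value).hostname
                if host:
                    hosts.add(host)
    return sorted(hosts)
```

For example, calling `hosts_in_source(Path("news_scraper.py").read_text(encoding="utf-8"))` (with `Path` imported from `pathlib`) would give a list to compare against uapis.cn and the expected platform domains.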
Persistence & Privilege
The skill does not request permanent inclusion (always=false) and does not modify other skills or global agent settings. It only reads/writes its own output files (JSON/Markdown).
Assessment
This skill appears to do what it claims, but consider the following before installing or running it:
- Legal/compliance: Scraping some Chinese platforms may violate their terms of service — check each site's TOS and robots.txt and ensure your use is permitted.
- Third‑party aggregator: The code calls a free aggregator (uapis.cn). Verify that service's reliability, privacy policy, and whether you trust it for production use.
- Resource & network: Generating abstractive summaries with transformers/torch will download models (Hugging Face) and can be CPU/GPU and disk intensive; plan for bandwidth and storage. If you don't install those deps, the abstractive mode will fail.
- Rate limits & politeness: Use the provided delays and consider proxies/captcha handling if you scale; avoid high-frequency scraping to prevent IP blocking or harming target sites.
- Documentation inconsistency: README suggests a pip package and project homepage that don't match the registry metadata here — verify the package/source if you intend to pip install rather than run the local scripts.
- Data handling: The scripts save scraped content locally; ensure you do not collect or store personal/private data unintentionally, and sanitize outputs if required.
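The robots.txt and rate-limit points above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the skill's own delay mechanism: `build_robots_checker` and `Throttle` are hypothetical names, and the robots.txt lines are passed in directly so the example needs no network access.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_lines):
    """Parse robots.txt content that has already been fetched as lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser


class Throttle:
    """Enforce a minimum interval, in seconds, between successive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep just long enough that calls are min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A scraping loop would call `parser.can_fetch(user_agent, url)` before each request and `throttle.wait()` between requests. The injectable `clock` and `sleep` arguments exist only so the timing logic can be tested without real delays.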
If you need higher assurance, ask the author for the canonical source repository or release, and audit the full (untruncated) news_scraper.py file in your environment before running.
Like a lobster shell, security has layers: review code before you run it.
latest: vk975jcrgt2b69msnb6zt5px0h983b6kn
