news_scraper

Security checks across malware telemetry and agentic risk

Overview

The skill appears aimed at public trend collection and summarization, but its proxy guidance for large-scale scraping is overbroad enough to require review before installation.

Review this skill before installing. It may be acceptable for public trend monitoring, but avoid proxy-based large-scale scraping unless you have authorization and clear rate-limit controls. Prefer official APIs or low-volume access, pin model revisions or use local summarization models, and do not process sensitive text with remotely downloaded models unless your policy allows it.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (3)

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The documentation explicitly recommends use of proxy IPs for large-scale scraping, which goes beyond the stated purpose of fetching hot-topic lists and summarizing them. In this context, proxy guidance can enable evasion of rate limits, anti-bot controls, and source-based blocking, making the skill more suitable for scaled collection against platform restrictions.
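
As an illustration of the "clear rate-limit controls" the overview asks for, the following is a minimal sketch of a client-side throttle that enforces a minimum interval between requests. The class name `MinIntervalThrottle` and its parameters are hypothetical and do not come from the skill. The clock and sleep functions are injectable only so the behavior can be checked without real delays.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Allow at most one request per `min_interval` seconds (hypothetical helper)."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if min_interval < 0:
            raise ValueError("min_interval must be non-negative")
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous request was released

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            # Time still owed since the previous request, never negative.
            delay = max(0.0, self._last + self.min_interval - now)
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

A caller would invoke `wait()` immediately before each fetch from a single source address. A throttle like this limits request volume. It does not replace authorization from the source platform.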

Missing User Warnings

Medium

Confidence: 90%
Finding: The examples instruct users to load HuggingFace models via the transformers pipeline using model identifiers that may trigger outbound network access and remote artifact downloads. In a scraping-and-summarization skill that may process sensitive or unpublished news text, failing to warn about network access and data handling can lead operators to unknowingly transmit metadata or fetch unpinned external dependencies, increasing privacy and supply-chain risk.
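
One way an operator might reduce this exposure is to refuse unpinned model references and to disable Hub network access once the artifacts are cached. The sketch below assumes the `transformers` package is installed and that the summarizer is built with its `pipeline` function. The helper names, the task default, and the choice to require a full 40-character commit hash are illustrative assumptions, not part of the skill.

```python
import os
import re

# A full git commit hash: 40 lowercase hexadecimal characters.
_COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")


def is_pinned_revision(revision) -> bool:
    """Return True only for a full commit hash, not a branch or tag name."""
    return isinstance(revision, str) and bool(_COMMIT_SHA.match(revision))


def force_offline_hub() -> None:
    """Ask the Hugging Face libraries to use only locally cached files.

    These variables are read when the libraries are imported, so call this
    before the first import of `transformers` or `huggingface_hub`.
    """
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


def build_summarizer(model_id: str, revision: str, task: str = "summarization"):
    """Build a pipeline for a model pinned to an exact commit (hypothetical helper)."""
    if not is_pinned_revision(revision):
        raise ValueError("revision must be a full 40-character commit hash")
    # Imported lazily so the checks above run even without transformers installed.
    from transformers import pipeline

    return pipeline(task, model=model_id, revision=revision)
```

Pinning to a commit hash rather than a branch such as `main` means a later change to the remote repository cannot silently alter what is loaded.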

Missing User Warnings

Medium

Confidence: 92%
Finding: The code uses AutoTokenizer.from_pretrained and AutoModelForSeq2SeqLM.from_pretrained with external model names or paths, which commonly causes downloads from remote registries if artifacts are not already cached. In this skill's context, where scraped content from external platforms may include sensitive or regulated data, undocumented remote dependency retrieval creates avoidable privacy, compliance, and supply-chain exposure.
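
A possible mitigation is to load the tokenizer and model from a directory that was downloaded and reviewed ahead of time, passing `local_files_only=True` so that `from_pretrained` does not contact a remote registry. This is a sketch under the assumption that `transformers` is installed. The function name `load_local_seq2seq` and the directory argument are hypothetical.

```python
from pathlib import Path


def load_local_seq2seq(model_dir):
    """Load a tokenizer and seq2seq model from a local directory only."""
    path = Path(model_dir)
    if not path.is_dir():
        # Fail early instead of letting the name be treated as a remote model id.
        raise FileNotFoundError(f"model directory not found: {path}")

    # Imported lazily so the path check runs even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(path, local_files_only=True)
    model = AutoModelForSeq2SeqLM.from_pretrained(path, local_files_only=True)
    return tokenizer, model
```

With this approach a missing artifact surfaces as an error at load time rather than as an undocumented download.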

VirusTotal

66/66 vendors flagged this skill as clean.
