Legado Book Source Developer

Security checks across malware telemetry and agentic risk

Overview

This skill is mostly a Legado book-source development toolkit, but it includes under-scoped anti-bot bypass, credential/session handling, and upload capabilities that users should review carefully before installing.

Install only if you are comfortable with a scraping-oriented development skill that can guide network requests, cookie/session use, local snapshot storage, and source uploads. Do not give the agent passwords, MFA codes, live cookies, or Authorization headers. Use it only on sites you own or are authorized to test, avoid the anti-bot bypass material, and review any upload destination before running the upload tool.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (38)

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: This example explicitly teaches direct scripted network posting via java.post(...), which materially increases the skill's operational capability from passive rule authoring to active request execution. In an agent setting, this can be repurposed to send arbitrary HTTP requests to third-party sites, bypass simpler guardrails around selector-only assistance, and facilitate unauthorized scraping or interaction with remote endpoints.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The guide documents login URLs, cookie persistence, manual Cookie/Authorization headers, and login-state scripts, which materially expand the skill from passive rule authoring into authenticated-session handling. In a scraping-oriented context, this can enable users to collect, persist, and replay sensitive session material against third-party sites without clear scope limits or safety guardrails.

Context-Inappropriate Capability

Medium

Confidence: 80%
Finding: Describing WebView event listeners and custom-button interactions extends the skill into active browser-style interaction and page automation, which is beyond simple HTML/rule analysis. That capability can be repurposed to drive gated flows, trigger unintended actions, or bypass assumptions that the skill is only for parsing public content.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This file implements broad website account, bookshelf, bookmark, voting, and report functionality that is materially outside the declared skill scope of Legado book-source development. In a skill context, this mismatch is dangerous because it increases the attack surface and could enable collection or misuse of user credentials and account tokens under the guise of a developer-assistance tool.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The account-management functions include login, registration, password change, profile retrieval, profile editing, and logout, none of which are necessary for book-source rule development. In the stated skill context this is suspicious and risky because it normalizes credential entry and token storage in a tool that users would not expect to manage website accounts.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: Bookshelf, bookmark, voting, analytics, and feedback/report actions are unrelated to the declared Legado source-development purpose and perform authenticated or user-linked operations against backend APIs. This is dangerous because hidden or unjustified action-capable code can be abused to act on behalf of users or to exfiltrate behavioral/account data in a context where such capabilities are not expected.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The knowledge index enumerates many third-party book sources, including targets that appear unrelated to a narrowly scoped, authorized-development helper. In a skill meant for "user-owned or authorized websites," bundling a catalog of external sources materially increases the likelihood that the agent can assist scraping or bypassing access controls for unauthorized sites, which expands abuse potential beyond the declared scope.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The indexed content includes unrelated domains such as video, music, external apps, and live-stream/sports sources, while the skill is presented as a Legado book-source development assistant. This scope creep is risky because it broadens operational capability into areas more associated with scraping, deep-link launching, or circumvention workflows that are unnecessary for the claimed purpose.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The guide explicitly tells the operator to ask the user for login credentials when anti-scraping is encountered. That expands the skill from book-source debugging into credential collection and handling, creating unnecessary exposure of highly sensitive secrets and enabling misuse if the skill, logs, or downstream tooling are compromised. In the context of a scraping-oriented skill, this is more dangerous because credentialed access can bypass site protections and normalize unsafe secret-sharing behavior.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The code explicitly detects Cloudflare and slider-based anti-bot pages, then attempts to defeat them using WebView execution, crafted cookies, encrypted payloads, and replay/generated interaction traces. That goes well beyond legitimate Legado source debugging and materially enables automated access to protected sites, which is dangerous because it facilitates bypass of access controls and anti-abuse defenses.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The skill contains dedicated CAPTCHA/anti-bot circumvention capability, including challenge detection, browser automation, token harvesting, crypto setup, and multiple automated solve attempts. In the stated context of book-source development, this is unjustified and increases risk because it operationalizes evasion of website protections rather than supporting normal rule authoring.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The documentation includes a dedicated upload tool for sending generated book-source files to third-party hosting services, which expands the skill from local analysis/development into external distribution. That increases the risk of unintended data exfiltration, especially if generated files contain proprietary rules, internal URLs, cookies, or other sensitive metadata not meant for third parties.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The custom upload configuration allows arbitrary external endpoints and authorization headers, effectively exposing a generic file-exfiltration primitive under the guise of book-source sharing. In an agent context, this is dangerous because it can be repurposed to send generated artifacts or other accessible files to attacker-controlled infrastructure with bearer tokens or other credentials.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The document includes a remote self-update/import mechanism that fetches a JSON subscription source from multiple external URLs and opens a `legado://import/rssSource` deep link in a browser context. That behavior is materially different from HTML analysis or rule generation and can cause users to import untrusted configuration from the network, which expands trust boundaries and creates a supply-chain style risk if any mirror, CDN, or proxy is compromised.
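
One way to narrow the supply-chain exposure described in this finding is to check a fetched subscription file against a digest pinned in advance, before any import link is opened. The following is a minimal sketch under that assumption; `verify_pinned_sha256` and the sample payload are hypothetical and are not part of the skill.

```python
import hashlib
import hmac


def verify_pinned_sha256(payload: bytes, pinned_hex: str) -> bool:
    """Return True only if payload hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest keeps the comparison time independent of where a mismatch occurs.
    return hmac.compare_digest(actual, pinned_hex.lower())


# Hypothetical usage: the digest would be recorded when the file was reviewed.
reviewed = b'{"sourceName": "example"}'
pinned = hashlib.sha256(reviewed).hexdigest()
print(verify_pinned_sha256(reviewed, pinned))              # matches the reviewed copy
print(verify_pinned_sha256(b'{"sourceName": "x"}', pinned))  # any altered copy fails
```

A pinned digest only helps if it is distributed through a channel separate from the mirrors, CDNs, or proxies that serve the file itself.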

Intent-Code Divergence

Medium

Confidence: 82%
Finding: The documentation claims login-required sites are not applicable, yet it also documents a login UI, persistence of user-entered site data via `source.putLoginInfo`, and workflow around saving that information. This mismatch can mislead users about the sensitivity of the feature and may result in credentials or session-related information being stored or reused without sufficient disclosure, which is a security and privacy concern even if not overtly malicious.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The document exposes access to a persistent device identifier via androidId(), which is not necessary for ordinary book-source parsing or selector development. In a skill that already supports network access, documenting device ID retrieval creates an unnecessary fingerprinting primitive that could be combined with requests for tracking, correlation, or covert exfiltration.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The documentation advertises broad local file deletion and archive extraction features that exceed the stated purpose of developing book-source rules. Even as documentation, this expands the operational capability available to skill authors toward destructive filesystem actions and bulk local content access, increasing the chance of misuse or harmful automation.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: Documenting importScript(path) for network and local sources enables execution of externally supplied JavaScript, which is a powerful code-loading primitive unrelated to simple rule authoring. In this skill context, it could be used to fetch and run untrusted logic, bypass review of embedded rules, and combine with network/file APIs for data access or exfiltration.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The tool serializes user-provided book source content and uploads it to a public third-party hosting service by default. That creates an unnecessary external data transmission path outside the core skill purpose of creating/debugging sources, and may expose proprietary source rules, credentials embedded in headers/cookies, or private site integration details.
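
To illustrate the exposure of credentials embedded in headers or cookies, a reviewer could strip credential-bearing headers from a source definition before it leaves the machine. The sketch below assumes the book source is a JSON object whose `header` field holds a JSON-encoded header map; that field layout, the `scrub_source` helper, and the header list are assumptions for illustration, not the tool's actual behavior.

```python
import json

# Header names treated as credential-bearing (assumed list, compared case-insensitively).
SENSITIVE_HEADERS = {"cookie", "authorization", "proxy-authorization"}


def scrub_source(source: dict) -> dict:
    """Return a copy of a book-source dict with credential-bearing headers removed."""
    cleaned = dict(source)
    raw = cleaned.get("header")
    if not isinstance(raw, str) or not raw.strip():
        return cleaned
    try:
        headers = json.loads(raw)
    except json.JSONDecodeError:
        headers = None
    if not isinstance(headers, dict):
        # Header values that are not a plain JSON map (for example script-based
        # ones) cannot be inspected here, so they are dropped rather than guessed at.
        cleaned.pop("header")
        return cleaned
    safe = {k: v for k, v in headers.items() if k.lower() not in SENSITIVE_HEADERS}
    cleaned["header"] = json.dumps(safe)
    return cleaned


example = {
    "bookSourceName": "demo",
    "header": json.dumps({"User-Agent": "demo-agent", "Cookie": "sid=1"}),
}
print(scrub_source(example)["header"])
```

The function returns a copy, so the local working file keeps its original headers while only the scrubbed version is a candidate for upload.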

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The code accepts arbitrary user-supplied upload configuration, including destination URL, headers, body fields, and content type, then performs a server-side POST to that endpoint. This enables unbounded outbound transmission and SSRF-style behavior, allowing an attacker to exfiltrate book source contents or reach internal/private network services if the tool runs in a trusted environment.
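
A common mitigation for this kind of unbounded outbound POST is to validate the destination against a fixed allowlist before sending anything. The sketch below is one possible form of that check; `ALLOWED_UPLOAD_HOSTS`, the example hostname, and `is_allowed_upload_url` are hypothetical names, and the check does not cover DNS rebinding or redirects.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would list its reviewed upload hosts.
ALLOWED_UPLOAD_HOSTS = {"uploads.example.com"}


def is_allowed_upload_url(url: str) -> bool:
    """Accept only https URLs whose host is an allowlisted name, never an IP literal."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL, e.g. unbalanced IPv6 brackets
    if parts.scheme != "https" or host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, fall through to the allowlist check
    else:
        return False  # IP literals (including loopback and private ranges) are refused
    return host in ALLOWED_UPLOAD_HOSTS


print(is_allowed_upload_url("https://uploads.example.com/api"))
print(is_allowed_upload_url("https://127.0.0.1/admin"))
```

Checking the parsed hostname rather than the raw string means a URL such as `https://uploads.example.com@evil.example/` is evaluated against `evil.example`, which is not on the list.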

Missing User Warnings

Medium

Confidence: 91%
Finding: The document teaches use of cookies, credential-bearing headers, and login-related requests, but does not give a clear up-front warning about the privacy and security risks of storing, replaying, or exposing session tokens. This omission increases the chance that users will paste live secrets into source definitions or logs and misuse them on third-party services.

Missing User Warnings

Medium

Confidence: 98%
Finding: Requesting login credentials without a prominent warning about sensitive-data handling is an unsafe prompt design pattern. It encourages users to disclose passwords or other secrets into an analysis workflow that is not presented as a secure credential vault, increasing the risk of accidental retention, logging, or unauthorized reuse. The skill context makes this worse because users troubleshooting anti-scraping may feel pressured to hand over credentials to make progress.

Missing User Warnings

Medium

Confidence: 88%
Finding: The workflow explicitly directs use of browser and HTTP MCP tools to access external sites, analyze requests, and retrieve content, but it does not instruct the agent to obtain user confirmation or warn that external requests will transmit data to third-party servers. In an agent setting, this can cause unintended network access, leak user-provided URLs or query terms, and create compliance/privacy issues, especially because the skill is designed to inspect live websites.

Missing User Warnings

Low

Confidence: 80%
Finding: The document recommends storing captured HTML/page snapshots to local files for later analysis without warning that the data may contain copyrighted material, personal data, tokens, or session-specific content. Persisting fetched content increases the risk of unintended retention and secondary exposure if the workspace is shared or later reused.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code logs partial guard token values and records challenge-related metadata in logs and workflow state. Even partial token disclosure can help anyone with access to the debug output abuse it, leak challenge context to other components, and expose sensitive verification artifacts that should not appear in plain logs.
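
Where a log line still needs to tell two tokens apart, one alternative to printing part of the value is to log a short one-way fingerprint instead. This is a minimal sketch of that idea; `token_fingerprint` is a hypothetical helper, and a truncated hash is only appropriate when the token itself is long and random.

```python
import hashlib


def token_fingerprint(token: str) -> str:
    """Return a short, non-reversible label for a token, suitable for log lines."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    # Only the first 8 hex characters are kept: enough to correlate log entries
    # without reproducing any character of the token itself.
    return f"sha256:{digest[:8]}"


print("guard token", token_fingerprint("guard-token-value"))
```

The same token always produces the same label, so entries can still be correlated across a log without the token ever appearing in it.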

VirusTotal

65/65 vendors flagged this skill as clean.
