Cfc Disclosure Monitor

Security checks across malware telemetry and agentic risk

Overview

The skill is mostly a public disclosure scraper, but it includes undeclared hard-coded API keys and third-party OCR/LLM upload paths that users should review before installing.

Install only in an isolated workspace. Remove and rotate the embedded MiniMax/GLM keys, supply your own credentials only when you intentionally enable OCR/LLM features, and assume downloaded documents, images, extracted phone numbers, and ontology entries will persist locally. Keep target URLs limited to the intended public disclosure sites and avoid running the weakened-browser diagnostic scripts against untrusted pages.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (18)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: f.write(content) if dest.stat().st_size > max_kb * 1024: return f"[PDF too large: {dest.stat().st_size//1024}KB, skipped]", str(dest) result = subprocess.run(['pdftotext', str(dest), '-'], capture_output=True, text=True, timeout=60) return result.stdout, str(dest) except Exception as e:
Confidence: 84%
Finding: result = subprocess.run(['pdftotext', str(dest), '-'], capture_output=True, text=True, timeout=60)

Tainted flow: 'req' from os.environ.get (line 56, credential/environment) → urllib.request.urlopen (network output)

Critical

Category: Data Flow
Content: }, method="POST", ) with urllib.request.urlopen(req, timeout=60) as resp: result = json.loads(resp.read()) status = result.get("base_resp", {}).get("status_code", -1)
Confidence: 94%
Finding: with urllib.request.urlopen(req, timeout=60) as resp:

Intent-Code Divergence

Medium

Confidence: 85%
Finding: The module description frames this as a local cleaning/evaluation phase, but the implementation also downloads remote images and sends them to an external OCR provider. This mismatch can mislead reviewers and operators, increasing the risk of undisclosed data egress and unsafe deployment assumptions.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The file includes external-network OCR capability that sends image content to MiniMax, but no manifest or surrounding skill metadata justifies that behavior. In a skill with unclear declared purpose, hidden external processing materially raises the risk of privacy violations and policy bypass.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code downloads arbitrary image URLs derived from announcement data and stores them in /tmp for processing. If upstream data is attacker-controlled, this can be abused for unintended network access, fetching hostile content, or interacting with internal resources in SSRF-like scenarios.
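
One way to narrow this exposure is to validate each image URL before fetching it. The sketch below is illustrative only and is not taken from the skill: the function name is_safe_image_url, the ALLOWED_HOSTS set, and the example hostname are all assumptions. It accepts only HTTPS URLs whose host is on an explicit allowlist and whose resolved addresses are all publicly routable.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the disclosure sites you intend to scrape.
ALLOWED_HOSTS = {"disclosure.example.com"}


def is_safe_image_url(url, allowed_hosts=ALLOWED_HOSTS, resolve=socket.getaddrinfo):
    """Return True only for HTTPS URLs on the allowlist that resolve to public IPs."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    host = parsed.hostname
    if host is None or host not in allowed_hosts:
        return False
    try:
        port = parsed.port or 443
        infos = resolve(host, port)
    except (ValueError, OSError):
        # Invalid port or failed DNS lookup: refuse rather than guess.
        return False
    for info in infos:
        # info[4][0] is the address string; drop any IPv6 scope id.
        address = ipaddress.ip_address(info[4][0].split("%", 1)[0])
        if not address.is_global:
            return False
    return bool(infos)
```

The check resolves the hostname separately from the later download, so it does not by itself prevent DNS rebinding; connecting to the validated address directly would be a stronger design.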

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The docstring describes a diagnostic scraper, but the browser is launched with flags that disable core protections such as certificate validation and web security. This mismatch matters because operators may run the script assuming low-risk page inspection while it actually visits untrusted sites in a significantly weakened browser context, increasing exposure to malicious content and making the behavior less transparent.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The script enables a broad set of insecure Chromium flags, including disabling web security and accepting invalid certificates, while visiting multiple external websites. If one of those sites is compromised or TLS interception occurs, the browser may load mixed or malicious content that normal protections would block, increasing the risk of data exposure or abuse of the automation environment.
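
A reviewer could screen the launch arguments before the browser starts. The following is a minimal sketch, not the skill's code: the helper name split_launch_args is hypothetical, and the INSECURE_FLAGS set is an assumption about which Chromium switches the diagnostic scripts pass, since the report names only the disabled protections and not the exact flags.

```python
# Chromium switches that weaken browser protections. This set is an
# assumption for illustration; the report does not list the exact flags.
INSECURE_FLAGS = {
    "--disable-web-security",
    "--ignore-certificate-errors",
    "--allow-running-insecure-content",
    "--no-sandbox",
}


def split_launch_args(args):
    """Separate launch arguments into (safe, rejected) lists."""
    safe, rejected = [], []
    for arg in args:
        # Compare only the switch name so "--flag=value" forms are caught too.
        name = arg.split("=", 1)[0]
        if name in INSECURE_FLAGS:
            rejected.append(arg)
        else:
            safe.append(arg)
    return safe, rejected
```

Passing only the safe list to the browser launcher and logging the rejected list keeps the weakened configuration visible to the operator instead of silent.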

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The parser sends announcement text and metadata to an external LLM service, creating a data exfiltration path outside the local processing boundary. Even if announcements are often public, this code does not enforce that only public content is sent, nor does it provide consent, redaction, or policy controls before transmission.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation explicitly describes automatic PDF downloading, local file creation under workspace paths, and persistent writes to a shared knowledge-graph file, but it does not warn the user or require confirmation before network access and persistence occur. In an agent skill context, silent network activity and durable writes can expose the environment to unintended data collection, storage growth, and contamination of shared memory or downstream workflows.

Missing User Warnings

High

Confidence: 99%
Finding: A live-looking API key is hardcoded as the default credential, meaning anyone with code access can reuse it and any deployment lacking an environment override will silently transmit data under that credential. This creates both secret leakage risk and unauthorized third-party service consumption, and it is especially dangerous because the same code also uploads image content externally.
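
A common remediation is to drop the embedded default and fail closed when no credential is supplied. The sketch below assumes a hypothetical helper load_api_key and a hypothetical variable name MINIMAX_API_KEY; the skill's actual variable names are not shown in this report.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required API key is not configured."""


def load_api_key(env_var, environ=os.environ):
    """Read an API key from the environment, with no hard-coded fallback."""
    key = environ.get(env_var, "").strip()
    if not key:
        raise MissingCredentialError(
            f"{env_var} is not set; refusing to call the external service."
        )
    return key
```

For example, load_api_key("MINIMAX_API_KEY") would be called once at startup, so a deployment without its own credential stops before any image or text leaves the machine.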

Missing User Warnings

Medium

Confidence: 93%
Finding: The OCR function base64-encodes local image content and sends it to an external API without any visible disclosure, approval gate, or data classification check. This can leak sensitive announcement images or proprietary source material to a third party, especially in environments where operators expect a local processing pipeline.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script recursively deletes the entire output directory with shutil.rmtree when '--resume' is not used, without an explicit confirmation prompt or strong safety checks. If the computed path is broader than expected or the operator targets the wrong date/output location, this can cause destructive data loss affecting existing collected data.
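
The missing safety checks could look roughly like the sketch below. It is an illustration under stated assumptions, not the skill's implementation: remove_output_dir, allowed_root, and the confirm parameter are hypothetical names. The function deletes a directory only when it sits strictly inside an expected root and the caller has confirmed explicitly.

```python
import shutil
from pathlib import Path


def remove_output_dir(target, allowed_root, confirm=False):
    """Delete target only if it lies strictly inside allowed_root and is confirmed."""
    root = Path(allowed_root).resolve()
    path = Path(target).resolve()
    # Reject the root itself and anything outside it (including ".." escapes).
    if path == root or root not in path.parents:
        raise ValueError(f"Refusing to delete {path}: not inside {root}")
    if not confirm:
        raise PermissionError(f"Deletion of {path} requires explicit confirmation")
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False
```

A command-line wrapper could set confirm from an explicit flag or an interactive prompt, so that omitting '--resume' no longer implies deletion on its own.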

Missing User Warnings

Medium

Confidence: 90%
Finding: The script silently makes outbound requests to a list of third-party sites using a browser configured to bypass important security safeguards, but provides only routine console output. In a skill context, this is dangerous because users may not realize the extent of network activity or that browsing occurs with weakened protections, reducing informed consent and safe operation.

Missing User Warnings

Medium

Confidence: 82%
Finding: The script retrieves remote PDFs from external sites, stores them locally, and immediately processes them with an external parser. In this context, the lack of warning is less important than the actual risky behavior: untrusted document ingestion can trigger parser exploits, resource exhaustion, or processing of unexpected content.

Missing User Warnings

Medium

Confidence: 93%
Finding: The LLM extraction path packages company name, title, date, URL, and up to thousands of characters of announcement text for external transmission without any user-facing disclosure or consent mechanism in the code. This creates a supply-chain privacy risk because operators may not realize that local data is being exported to a third-party model provider.
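
An explicit opt-in gate with a bounded payload would address the consent gap described here. The sketch below is hypothetical: build_llm_payload, the allow_external parameter, the 2000-character cap, and the dictionary keys are assumptions chosen to mirror the fields named in the finding, not the skill's real data model.

```python
def build_llm_payload(announcement, *, allow_external, max_chars=2000):
    """Build the outbound payload only when external upload is explicitly enabled."""
    if not allow_external:
        raise PermissionError(
            "External LLM upload is disabled; enable it explicitly to proceed."
        )
    # Send only the named fields, never the whole record.
    payload = {
        key: announcement.get(key, "")
        for key in ("company", "title", "date", "url")
    }
    # Cap the text so the amount leaving the machine is bounded and known.
    payload["text"] = announcement.get("text", "")[:max_chars]
    return payload
```

Because allow_external is keyword-only and has no default, every call site has to state the decision, which makes the data export visible in code review.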

Missing User Warnings

Medium

Confidence: 99%
Finding: The code includes a hardcoded default GLM API key in source, which is a real secret exposure and can enable unauthorized use of the external account if the repository or skill is shared. Combining embedded credentials with automatic outbound requests materially increases the risk of account abuse, billing loss, and provider-side compromise of usage data.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script directly prints phone numbers from the local knowledge graph when listing partner organizations, which can expose personal or sensitive business contact data to anyone with CLI access or captured logs. In this context, the dataset appears to contain cooperation and disclosure information for financial-sector entities, making accidental overexposure more sensitive than a generic directory lookup.
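
Masking numbers before they reach the console is one way to limit this exposure. The helper below is a minimal sketch with a hypothetical name, mask_phone, and an assumed policy of showing only the last four digits; it is not part of the skill.

```python
def mask_phone(number, visible=4):
    """Replace all but the last `visible` digits with '*', keeping separators."""
    total_digits = sum(ch.isdigit() for ch in number)
    keep_from = max(total_digits - visible, 0)
    seen = 0
    masked = []
    for ch in number:
        if ch.isdigit():
            masked.append(ch if seen >= keep_from else "*")
            seen += 1
        else:
            masked.append(ch)
    return "".join(masked)


print(mask_phone("010-1234-5678"))  # ***-****-5678
```

Numbers with four or fewer digits pass through unchanged under this policy, so a stricter rule may be preferable for short extensions.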

Missing User Warnings

Medium

Confidence: 86%
Finding: The skill converts potentially sensitive source documents into image files and writes extracted company/phone data to disk under a persistent results directory without any access-control, retention, minimization, or user-consent safeguards in the code path. In this context, the processor handles OCR of financial/cooperation disclosure materials and may accumulate sensitive document artifacts and extracted contact data that could be exposed to other local users, backups, logs, or downstream tooling.

VirusTotal

62 of 62 vendors reported this skill as clean.
