A powerful screen-viewing, OCR, and screen-analysis skill package designed for AI assistants

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real screen capture and OCR skill, but it needs review because it can monitor sensitive screen contents and includes a Windows installer flow that downloads and silently runs an external OCR binary without integrity verification.

Install only if you trust the publisher and are comfortable granting screen-capture capability. Prefer manual, verified Tesseract installation over the automatic Windows installer, avoid running the scripts as administrator unless necessary, use explicit capture regions and output paths, and delete saved screenshots/OCR text/logs that may contain sensitive information.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
Findings (38)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
# Check environment variables
        try:
            result = subprocess.run(
                "tesseract --version",
                shell=True,
                capture_output=True,
Confidence
94% confidence
Finding
result = subprocess.run( "tesseract --version", shell=True, capture_output=True, text=True, encoding='utf-8'
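
A lower-risk way to make the same kind of version check is sketched below, assuming the skill only needs to know whether a `tesseract` executable is reachable on PATH. It resolves the executable with `shutil.which` and passes an argument list so that no shell is involved. The helper name `tesseract_version` is hypothetical and does not come from the scanned skill.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version(executable: str = "tesseract") -> Optional[str]:
    """Return the first line of `<executable> --version`, or None if unavailable.

    Hypothetical helper for illustration; not part of the scanned skill.
    """
    path = shutil.which(executable)  # resolve on PATH without invoking a shell
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"],  # argument list, so the default shell=False applies
            capture_output=True,
            text=True,
            encoding="utf-8",
            errors="replace",
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print version text to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None
```

Because the command is a fixed list rather than a string handed to a shell, there is no shell parsing step for untrusted input to influence.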

subprocess module call

Medium
Category
Dangerous Code Execution
Content
print("正在安装,请稍候...")  # "Installing, please wait..."
        
        # Run the installer
        result = subprocess.run(
            install_command,
            shell=True,
            capture_output=True,
Confidence
97% confidence
Finding
result = subprocess.run( install_command, shell=True, capture_output=True, text=True, encoding='utf-8' )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
print(f"命令: {cmd}")  # "Command: {cmd}"
    
    try:
        result = subprocess.run(
            cmd, 
            shell=True, 
            capture_output=True,
Confidence
95% confidence
Finding
result = subprocess.run( cmd, shell=True, capture_output=True, text=True, encoding='utf-8', errors='replace'

exec() call detected

High
Category
Dangerous Code Execution
Content
# Run the test code
    try:
        exec(test_code)
        return True
    except Exception as e:
        print(f"❌ 验证失败: {e}")  # "Verification failed: {e}"
Confidence
93% confidence
Finding
exec(test_code)
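
If the purpose of `test_code` is only to confirm that dependencies are installed (an assumption, since the contents of `test_code` are not shown in the finding), the same check can be made without `exec()`. The sketch below uses `importlib.util.find_spec` to look up top-level modules without importing or executing them. The function name `check_modules` is hypothetical.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be located.

    Hypothetical helper: find_spec() locates a top-level module without
    importing it, so no dynamically built code is executed.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # The module names here are placeholders, not the skill's real dependencies.
    for module, found in check_modules(["json", "no_such_module_xyz_123"]).items():
        print(f"{module}: {'ok' if found else 'missing'}")
```

This sketch is meant for top-level names only; a dotted name whose parent package is missing makes `find_spec` raise rather than return `None`.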

Lp3

Medium
Category
MCP Least Privilege
Confidence
93% confidence
Finding
The skill documentation advertises and instructs use of shell execution, file read/write, environment access, and network-backed installation/update flows, but declares no permissions. That creates a transparency and consent gap: a user or orchestrator may invoke a seemingly simple screen-viewing skill without realizing it can execute commands, write files, and fetch/install software. In a screenshot/OCR context, those capabilities materially increase the blast radius because sensitive screen contents may be saved, processed, or combined with downloaded components.

Tp4

High
Category
MCP Tool Poisoning
Confidence
95% confidence
Finding
The documented behavior extends beyond the stated purpose of screenshotting and OCR into dependency installation, silent downloading/installation of Tesseract and language packs, screen-state probing, image-analysis features not disclosed in the description, and packaging/publishing workflows. This mismatch is dangerous because users may consent to benign screen capture but unknowingly authorize network activity, software installation, or broader system interaction. In a privacy-sensitive skill that can access live screen content, undisclosed extra behaviors significantly raise trust and supply-chain risk.

Description-Behavior Mismatch

Medium
Confidence
90% confidence
Finding
The README explicitly advertises timed monitoring and repeated screenshot capture, which expands the skill from ad hoc screen viewing into persistent surveillance behavior. In a screen-capture skill, this increases the risk of collecting sensitive information over time without clear scope limits, consent requirements, or retention guidance.

Context-Inappropriate Capability

Medium
Confidence
83% confidence
Finding
The README recommends running an installation script that automatically installs dependencies and OCR components, introducing a code-execution and supply-chain risk beyond the core purpose of viewing/analyzing the screen. Auto-install behavior is especially risky in agent skills because users may run setup scripts with broad system privileges without reviewing what they do.

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
Documenting an automatic Tesseract download/install step adds external software installation capability that is not part of ordinary screenshot/OCR usage and can expose users to unreviewed binary acquisition. In the context of an agent skill, normalizing automatic installation makes it easier to conceal unexpected system modifications or introduce supply-chain compromise paths.

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
These examples add timed, repeated screenshot capture for monitoring over a duration, which goes beyond a simple user-invoked screenshot/OCR utility. In a screen-capture skill, this materially increases surveillance capability and can enable covert collection of sensitive on-screen data over time if reused without strong consent and retention controls.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The change-detection loop continuously watches the screen and automatically saves screenshots when changes occur, creating a persistent surveillance primitive not described in the stated scope. Because it triggers without per-capture user intent, it can silently accumulate sensitive visual data and behavior patterns.
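
One way to narrow this kind of loop is to require explicit caps on both the number of captures and the total duration, so it cannot run unattended indefinitely. The following is a minimal sketch under stated assumptions: `grab` is a hypothetical callable returning the raw bytes of one frame, and `capture_changes` is an illustrative name, not the skill's API.

```python
import hashlib
import time
from typing import Callable, List, Optional


def capture_changes(
    grab: Callable[[], bytes],
    max_captures: int,
    max_seconds: float,
    interval: float = 1.0,
) -> List[bytes]:
    """Collect frames that differ from the previous poll, within hard limits.

    Hypothetical sketch: the caller must supply both limits explicitly.
    """
    if max_captures <= 0 or max_seconds <= 0:
        raise ValueError("explicit positive limits are required")
    deadline = time.monotonic() + max_seconds
    saved: List[bytes] = []
    last_digest: Optional[bytes] = None
    while len(saved) < max_captures and time.monotonic() < deadline:
        frame = grab()
        digest = hashlib.sha256(frame).digest()
        if digest != last_digest:  # the first frame always counts as a change
            saved.append(frame)
            last_digest = digest
        time.sleep(interval)
    return saved
```

Returning the frames to the caller, rather than writing them to disk inside the loop, leaves the decision about what to persist with the code that obtained the user's consent.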

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The automated report example collects screen resolution, mouse position, and a screenshot, then persists them into a report file. This expands the skill from screenshot/OCR into user activity and environment profiling, which is more privacy-sensitive and not justified by the declared purpose.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
This error-monitoring system repeatedly scans the screen, detects keywords, and logs findings over time, effectively implementing persistent surveillance and activity logging. In the context of a screen viewer skill, this can capture sensitive application states, internal errors, and potentially confidential text far beyond ad hoc troubleshooting.

Description-Behavior Mismatch

Medium
Confidence
89% confidence
Finding
The skill is described as a screen-viewing/OCR utility, but this script adds software installation behavior, including downloading binaries and modifying the host environment. That expands the skill's effective capability beyond passive OCR processing and increases the attack surface, especially in agent contexts where users may not expect system-level changes.

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The script downloads executable content and OCR data from external sources at runtime, which introduces supply-chain risk. Because there is no integrity verification such as checksum or signature validation, a compromised mirror, redirect, or man-in-the-middle condition could result in malicious files being written and later executed.
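
The missing integrity check could look roughly like the sketch below, which compares a downloaded file's SHA-256 digest against a value pinned in advance. It assumes the expected digest is obtained from a trusted source out of band; the function names are hypothetical and no real digest for any Tesseract installer is given here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the pinned value.

    Hypothetical helper: `expected_sha256` must come from a trusted,
    independently obtained source, not from the same download location.
    """
    return sha256_of(path) == expected_sha256.strip().lower()
```

A caller would refuse to run the installer whenever `verify_download` returns `False`, and delete the downloaded file.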

Context-Inappropriate Capability

High
Confidence
98% confidence
Finding
This is the most security-significant behavior in the file: it executes a downloaded installer silently through the shell. In the context of a screenshot/OCR skill, this adds arbitrary software installation and code execution capability unrelated to the stated purpose, making compromise of the download path especially dangerous.

Context-Inappropriate Capability

Medium
Confidence
89% confidence
Finding
The setup script executes system shell commands to install packages, expanding its capabilities beyond simple screen viewing. This is security-relevant because install-time command execution can change the local environment and increases the blast radius if the script is run in trusted contexts.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
Automatically modifying the user's Python environment by installing third-party packages is not inherently malicious, but it is a privileged side effect unrelated to the core runtime behavior of taking screenshots. In a skill context, users may invoke or trust the skill for screen analysis without expecting environment modification.

Intent-Code Divergence

Low
Confidence
96% confidence
Finding
The 'verify installation' routine does more than validation: it reads screen size and captures a real screenshot. For a screen-capture skill this capability is aligned with purpose, but performing it automatically during setup without an explicit prompt creates an unnecessary privacy risk.

Missing User Warnings

Low
Confidence
89% confidence
Finding
The installation guide explicitly mentions granting screen-recording permissions and running as administrator, but it does not clearly warn users about the security and privacy implications of doing so. For a screen-capture skill, this omission matters because the capability can expose sensitive on-screen data and elevated privileges increase the blast radius if the skill or its dependencies are misused.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The README promotes screenshot, OCR, file output, and timed monitoring but provides no visible warning that these operations may capture credentials, personal messages, tokens, or other sensitive on-screen data. Lack of disclosure and safe-handling guidance increases the chance of accidental privacy violations and insecure storage of captured content.

Vague Triggers

Medium
Confidence
87% confidence
Finding
The trigger language is broad enough to activate on general mentions of screenshots, viewing the screen, or analyzing on-screen content, which can cause this skill to run in situations where the user did not intend screen capture or OCR. Because screen capture can expose credentials, personal messages, and other sensitive data, accidental invocation is more dangerous here than for low-risk utilities. The skill context therefore amplifies the risk of over-broad triggering.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The documentation describes capturing screens, extracting text, and saving outputs without prominently warning that screenshots and OCR may collect highly sensitive information such as passwords, personal chats, financial data, or corporate secrets. In a screen-viewing skill, omission of privacy and storage warnings can lead to unsafe handling, persistence of confidential data on disk, and unintentional disclosure. The context makes this particularly risky because the primary function directly accesses one of the most sensitive data surfaces on a user device.
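
Retention guidance of this kind can be backed by a small cleanup routine. The sketch below deletes capture artifacts older than a caller-chosen age from a single directory. The function name, the default suffix list, and the idea of a dedicated capture directory are assumptions for illustration and do not come from the skill.

```python
import time
from pathlib import Path
from typing import List, Optional, Tuple


def purge_old_captures(
    directory: Path,
    max_age_seconds: float,
    suffixes: Tuple[str, ...] = (".png", ".txt", ".log"),
    now: Optional[float] = None,
) -> List[Path]:
    """Delete matching files older than max_age_seconds; return what was removed.

    Hypothetical helper: only regular files directly inside `directory`
    with one of the given suffixes are considered.
    """
    current = time.time() if now is None else now
    removed: List[Path] = []
    for entry in sorted(directory.iterdir()):
        if not entry.is_file() or entry.suffix.lower() not in suffixes:
            continue
        if current - entry.stat().st_mtime > max_age_seconds:
            entry.unlink()
            removed.append(entry)
    return removed
```

Note that unlinking a file does not securely erase its contents from disk; this only limits how long captures remain readily accessible.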

Missing User Warnings

Medium
Confidence
93% confidence
Finding
This demo captures the user's screen and writes the image to disk without a clear privacy warning, consent step, or guidance about sensitive content that may be recorded. In a screen-viewing skill, that context increases risk because screenshots can contain credentials, messages, tokens, or regulated data, and examples strongly influence downstream use.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The region screenshot demo persists captured screen content to example_region.png without disclosing that even partial screen captures may include secrets or personal data. Because this skill is specifically designed for screenshot capture and analysis, omission of privacy messaging makes accidental collection and retention more likely.

VirusTotal

64/64 vendors flagged this skill as clean.
