LocalDataAI

Security checks across malware telemetry and agentic risk

Overview

This skill is a local document-AI tool, but it makes strong offline, sandbox, and compliance claims that the artifacts do not actually enforce.

Review before installing, especially in regulated or confidential environments. Treat this as a local prototype rather than a verified private-computing or compliance solution: run it only in a controlled environment, manually vet downloaded models and dependencies, avoid relying on its sandbox claims, and check or disable audit/checkpoint/temp-file persistence if file paths or document metadata are sensitive.

SkillSpector

By NVIDIA

Vulnerability Patterns

Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (49)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 83%
Finding: The skill advertises significant capabilities including file processing, logging, model management, and downloads, yet declares no permissions. That mismatch reduces transparency and can cause the host or user to approve a skill without understanding that it may read/write local files and access the network, which is especially sensitive for a private-data processing tool.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 91%
Finding: The documentation claims purely offline, private, local-only processing, but the described behavior includes model download helpers referencing external services and other features not aligned with that claim. For a skill handling sensitive enterprise or personal documents, this mismatch is dangerous because users may trust it with confidential data under false assumptions about network isolation and implemented safeguards.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill explicitly claims full offline operation, but the installation instructions require automatically downloading models. This creates a direct contradiction that can mislead users in regulated or air-gapped environments into deploying software that unexpectedly requires network access or pulls artifacts from external sources.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The logger computes a hash by opening whatever path is passed as `file_name`, which creates unintended read access to arbitrary local files. In a local-data processing skill, that expands the component's file access beyond audit logging and can disclose file existence or content-derived metadata for sensitive paths if untrusted input reaches `log_operation`.
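One way to narrow this exposure, as a minimal sketch rather than the skill's actual code: resolve the path first and refuse to hash anything outside an explicitly allowed root. The names `safe_file_hash` and `allowed_root` are hypothetical; the report does not show the real `log_operation` signature.

```python
import hashlib
from pathlib import Path
from typing import Optional


def safe_file_hash(file_name: str, allowed_root: str) -> Optional[str]:
    """Return the SHA-256 of file_name only if it resolves inside allowed_root.

    Hypothetical helper: both the name and the allowed_root parameter are
    assumptions made for illustration.
    """
    root = Path(allowed_root).resolve()
    target = Path(file_name).resolve()
    try:
        # Raises ValueError when target escapes root (e.g. via ".." or a symlink).
        target.relative_to(root)
    except ValueError:
        return None
    if not target.is_file():
        return None
    digest = hashlib.sha256()
    with target.open("rb") as handle:
        # Read in chunks so large documents are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Returning None instead of raising keeps the logger from revealing, through distinct error messages, whether a path outside the root exists.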

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code dynamically modifies sys.path to load a Python module from a sibling skill directory based on a relative filesystem path. This creates a local code-loading trust boundary violation: if that external directory or its contents are replaced, tampered with, or unexpectedly present, this skill will execute unverified code during normal operation. In a 'local private data processing' skill, that is more concerning because imported code may gain access to sensitive local documents the skill is expected to process offline.
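A hedged alternative to editing sys.path is to load the sibling module from an explicit path only after its contents match a pinned digest. The sketch below assumes a hypothetical `load_verified_module` helper and a digest recorded at review time; it verifies only the one file, not anything that file imports.

```python
import hashlib
import types
from pathlib import Path


def load_verified_module(name: str, path: str, expected_sha256: str) -> types.ModuleType:
    """Execute a module's source only if its SHA-256 matches the pinned value.

    Hypothetical helper for illustration; expected_sha256 is assumed to be
    recorded when the sibling module was last reviewed.
    """
    source = Path(path).read_bytes()
    actual = hashlib.sha256(source).hexdigest()
    if actual != expected_sha256:
        raise ImportError(f"digest mismatch for {path}")
    module = types.ModuleType(name)
    module.__file__ = path
    # Compile the exact bytes that were hashed, so the file cannot be swapped
    # between verification and execution.
    exec(compile(source, path, "exec"), module.__dict__)
    return module
```

Executing the already-verified bytes, rather than re-reading the file through the normal import machinery, avoids a check-then-use gap.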

Intent-Code Divergence

High

Confidence: 99%
Finding: The code presents itself as a secure, isolated sandbox that prevents data leakage, but it only creates temporary directories and then executes an arbitrary processor function in the current Python process. No OS-level filesystem isolation, network blocking, privilege reduction, or resource controls are enforced, so a processor can still read arbitrary local files, access the network, or exfiltrate data despite the security claims.
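The gap can be illustrated with a small stand-in for the reported pattern. This is not the skill's code: `run_in_temp_dir` and `demo` are hypothetical names, and the sketch only shows that a temporary directory plus an in-process call does not stop the processor from reading a file elsewhere on disk.

```python
import os
import tempfile
from typing import Callable


def run_in_temp_dir(processor: Callable[[str], str]) -> str:
    """Hypothetical stand-in: create a temp dir, then call processor in-process."""
    with tempfile.TemporaryDirectory() as work_dir:
        return processor(work_dir)


def demo() -> str:
    # A file that lives outside the "sandbox" working directory.
    fd, secret_path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("confidential")

        def leaky_processor(work_dir: str) -> str:
            # Nothing prevents the processor from ignoring work_dir entirely.
            with open(secret_path) as handle:
                return handle.read()

        return run_in_temp_dir(leaky_processor)
    finally:
        os.remove(secret_path)
```

Actual isolation would need an OS-level mechanism such as a separate process under a container, namespace, or seccomp policy, which is outside the scope of this sketch.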

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The configuration exposes security-relevant options like isolate_filesystem, restrict_network, and max_memory_mb, but none of them are applied anywhere in execution. This creates a dangerous false sense of protection: callers may enable these flags and assume isolation exists when the processor still runs without any such restrictions.
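A common mitigation for unenforced flags is to fail closed: refuse to run when a caller requests a control the runtime cannot apply. The sketch below reuses the option names from the finding; the `SandboxConfig` class, `ENFORCED_CONTROLS`, and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SandboxConfig:
    # Option names come from the finding; the class itself is an assumption.
    isolate_filesystem: bool = False
    restrict_network: bool = False
    max_memory_mb: Optional[int] = None


# Controls this hypothetical runtime can actually enforce (none in this sketch).
ENFORCED_CONTROLS = frozenset()


def unenforced_controls(config: SandboxConfig) -> List[str]:
    """List the requested controls that have no enforcement behind them."""
    requested = []
    if config.isolate_filesystem:
        requested.append("isolate_filesystem")
    if config.restrict_network:
        requested.append("restrict_network")
    if config.max_memory_mb is not None:
        requested.append("max_memory_mb")
    return [name for name in requested if name not in ENFORCED_CONTROLS]


def check_config(config: SandboxConfig) -> None:
    """Raise instead of silently running without the requested protection."""
    missing = unenforced_controls(config)
    if missing:
        raise RuntimeError("requested but not enforced: " + ", ".join(missing))
```

Calling `check_config` before running a processor turns a silent false sense of protection into an explicit error.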

Missing User Warnings

Medium

Confidence: 87%
Finding: The handler writes temporary files derived from sensitive user documents to predictable paths beside the source file and only deletes them after processing. If processing crashes, deletion fails, or the directory is accessible to other local users/processes, confidential document contents may remain on disk and be exposed unexpectedly.
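A minimal sketch of a safer pattern, assuming a hypothetical `process_with_private_temp` wrapper: write the working copy into a freshly created private directory instead of beside the source file, and let a context manager remove it even when processing raises.

```python
import os
import tempfile
from typing import Callable


def process_with_private_temp(data: bytes, worker: Callable[[str], str]) -> str:
    """Run worker on a temp copy of data that is removed on success or failure.

    Hypothetical wrapper; the file name "working-copy" is an arbitrary choice.
    """
    # TemporaryDirectory creates an unpredictable, owner-only directory and
    # deletes it (with its contents) when the block exits, even on exceptions.
    with tempfile.TemporaryDirectory(prefix="docproc-") as tmp_dir:
        tmp_path = os.path.join(tmp_dir, "working-copy")
        # O_EXCL refuses to reuse an existing file; 0o600 limits access to the owner.
        fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return worker(tmp_path)
```

This does not cover a hard kill of the process, so sensitive deployments may still want the temporary directory on an encrypted or memory-backed volume.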

Missing User Warnings

Medium

Confidence: 83%
Finding: Checkpoint files persist source file metadata, including file paths and hashes, to disk without an explicit user-facing control or disclosure. In a privacy-focused offline-processing skill, this creates a local data residue risk because file locations and processing history may be visible to other local users or forensic tools.
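One possible way to reduce this residue, sketched under assumptions: store a keyed digest of the source path instead of the path itself, and create the checkpoint with owner-only permissions. The `write_checkpoint` function, the `source_id` and `status` fields, and the `key` parameter are all hypothetical; the report does not describe the real checkpoint format.

```python
import hashlib
import hmac
import json
import os


def write_checkpoint(checkpoint_path: str, source_path: str, key: bytes) -> None:
    """Write a checkpoint that identifies the source without recording its path.

    Hypothetical format: the field names below are assumptions for illustration.
    """
    # A keyed digest lets the tool recognise a file it has already processed
    # without leaving the readable path on disk.
    source_id = hmac.new(key, source_path.encode("utf-8"), hashlib.sha256).hexdigest()
    payload = json.dumps({"source_id": source_id, "status": "done"})
    # Create the file with owner-only permissions where the OS supports them.
    fd = os.open(checkpoint_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(payload)
```

The key would need to be kept outside the checkpoint directory, otherwise the digest can be matched against guessed paths.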

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Core dependencies torch>=2.0.0 transformers>=4.35.0 sentence-transformers>=2.2.2
Confidence: 92%
Finding: torch>=2.0.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Core dependencies torch>=2.0.0 transformers>=4.35.0 sentence-transformers>=2.2.2 # Document parsing
Confidence: 92%
Finding: transformers>=4.35.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Core dependencies torch>=2.0.0 transformers>=4.35.0 sentence-transformers>=2.2.2 # Document parsing unstructured[all-docs]>=0.11.0
Confidence: 90%
Finding: sentence-transformers>=2.2.2

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Document parsing unstructured[all-docs]>=0.11.0 pymupdf>=1.23.0 pdfplumber>=0.10.0 python-docx>=0.8.11 openpyxl>=3.1.0
Confidence: 91%
Finding: pymupdf>=1.23.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Document parsing unstructured[all-docs]>=0.11.0 pymupdf>=1.23.0 pdfplumber>=0.10.0 python-docx>=0.8.11 openpyxl>=3.1.0 pandas>=2.0.0
Confidence: 91%
Finding: pdfplumber>=0.10.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: unstructured[all-docs]>=0.11.0 pymupdf>=1.23.0 pdfplumber>=0.10.0 python-docx>=0.8.11 openpyxl>=3.1.0 pandas>=2.0.0
Confidence: 91%
Finding: python-docx>=0.8.11

Unpinned Dependencies

Low

Category: Supply Chain
Content: pymupdf>=1.23.0 pdfplumber>=0.10.0 python-docx>=0.8.11 openpyxl>=3.1.0 pandas>=2.0.0 # OCR
Confidence: 91%
Finding: openpyxl>=3.1.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: pdfplumber>=0.10.0 python-docx>=0.8.11 openpyxl>=3.1.0 pandas>=2.0.0 # OCR paddlepaddle-gpu>=2.5.0; sys_platform != "darwin"
Confidence: 88%
Finding: pandas>=2.0.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # OCR paddlepaddle-gpu>=2.5.0; sys_platform != "darwin" paddlepaddle>=2.5.0; sys_platform == "darwin" paddleocr>=2.7.0 easyocr>=1.7.0 # Vector database
Confidence: 89%
Finding: paddleocr>=2.7.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: paddlepaddle-gpu>=2.5.0; sys_platform != "darwin" paddlepaddle>=2.5.0; sys_platform == "darwin" paddleocr>=2.7.0 easyocr>=1.7.0 # Vector database chromadb>=0.4.0
Confidence: 88%
Finding: easyocr>=1.7.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: easyocr>=1.7.0 # Vector database chromadb>=0.4.0 faiss-cpu>=1.7.4 # Text processing
Confidence: 87%
Finding: chromadb>=0.4.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Vector database chromadb>=0.4.0 faiss-cpu>=1.7.4 # Text processing langchain>=0.1.0
Confidence: 87%
Finding: faiss-cpu>=1.7.4

Unpinned Dependencies

Low

Category: Supply Chain
Content: faiss-cpu>=1.7.4 # Text processing langchain>=0.1.0 langchain-community>=0.0.10 jinja2>=3.1.0 pyyaml>=6.0.1
Confidence: 93%
Finding: langchain>=0.1.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Text processing langchain>=0.1.0 langchain-community>=0.0.10 jinja2>=3.1.0 pyyaml>=6.0.1
Confidence: 93%
Finding: langchain-community>=0.0.10

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Text processing langchain>=0.1.0 langchain-community>=0.0.10 jinja2>=3.1.0 pyyaml>=6.0.1 # Encoding detection
Confidence: 90%
Finding: jinja2>=3.1.0

Unpinned Dependencies

Low

Category: Supply Chain
Content: langchain>=0.1.0 langchain-community>=0.0.10 jinja2>=3.1.0 pyyaml>=6.0.1 # Encoding detection chardet>=5.2.0
Confidence: 91%
Finding: pyyaml>=6.0.1
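The unpinned-dependency findings above all follow the same rule: a requirement that uses a range such as `>=` instead of an exact `==` pin is flagged. A minimal sketch of such a check, assuming a hypothetical `find_unpinned` helper and a plain requirements-style text input (this is not SkillSpector's actual implementation):

```python
from typing import List


def find_unpinned(requirements_text: str) -> List[str]:
    """Return requirement specifiers that are not pinned with '=='.

    Hypothetical helper: handles comments and environment markers only,
    not options such as '-r' includes or URLs.
    """
    unpinned = []
    for raw in requirements_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # Drop environment markers such as '; sys_platform == "darwin"',
        # which contain '==' but say nothing about the version pin.
        spec = line.split(";", 1)[0].strip()
        if "==" not in spec:
            unpinned.append(spec)
    return unpinned
```

Exact pins alone do not verify artifact integrity; pip's hash-checking mode (`--require-hashes`) is the usual complement.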

VirusTotal

66/66 vendors flagged this skill as clean.
