LocalDataAI

Security checks across malware telemetry and agentic risk

Overview

This skill is a local document-AI tool, but it makes strong offline, sandbox, and compliance claims that the artifacts do not actually enforce.

Review before installing, especially in regulated or confidential environments. Treat this as a local prototype rather than a verified private-computing or compliance solution: run it only in a controlled environment, manually vet downloaded models and dependencies, avoid relying on its sandbox claims, and check or disable audit/checkpoint/temp-file persistence if file paths or document metadata are sensitive.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (49)

Lp3

Medium
Category
MCP Least Privilege
Confidence
83% confidence
Finding
The skill advertises significant capabilities including file processing, logging, model management, and downloads, yet declares no permissions. That mismatch reduces transparency and can cause the host or user to approve a skill without understanding that it may read/write local files and access the network, which is especially sensitive for a private-data processing tool.

Tp4

High
Category
MCP Tool Poisoning
Confidence
91% confidence
Finding
The documentation claims purely offline, private, local-only processing, but the described behavior includes model download helpers referencing external services and other features not aligned with that claim. For a skill handling sensitive enterprise or personal documents, this mismatch is dangerous because users may trust it with confidential data under false assumptions about network isolation and implemented safeguards.

Description-Behavior Mismatch

Medium
Confidence
95% confidence
Finding
The skill explicitly claims full offline operation, but the installation instructions require automatically downloading models. This creates a direct contradiction that can mislead users in regulated or air-gapped environments into deploying software that unexpectedly requires network access or pulls artifacts from external sources.
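
One way a reviewer might test the offline claim is to ask the model libraries to refuse network access and see whether the skill still runs. The sketch below assumes the skill loads models through the Hugging Face libraries named in its requirements (transformers, sentence-transformers), which the scanned artifacts do not confirm. The helper name enable_offline_mode is hypothetical. These variables are a request honored by those libraries, not an enforced network block.

```python
import os


def enable_offline_mode() -> None:
    """Ask Hugging Face libraries to use only locally cached files.

    Assumption: the skill loads models via huggingface_hub / transformers.
    """
    # These variables are typically read when the libraries are imported,
    # so call this before importing transformers or sentence_transformers.
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


enable_offline_mode()
```

If a model load fails after this call, the model was not already cached locally, which is the situation the finding warns about for air-gapped deployments.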

Description-Behavior Mismatch

Medium
Confidence
91% confidence
Finding
The logger computes a hash by opening whatever path is passed as `file_name`, which creates unintended read access to arbitrary local files. In a local-data processing skill, that expands the component's file access beyond audit logging and can disclose file existence or content-derived metadata for sensitive paths if untrusted input reaches `log_operation`.
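
A minimal sketch of one possible mitigation: hash a file only when its resolved path lies inside a directory the caller has explicitly allowed. The function name safe_file_hash and the allowed_root parameter are hypothetical and do not appear in the scanned code. Only `file_name` is taken from the finding.

```python
import hashlib
from pathlib import Path
from typing import Optional


def safe_file_hash(file_name: str, allowed_root: str) -> Optional[str]:
    """Return a SHA-256 hex digest only for regular files under allowed_root."""
    root = Path(allowed_root).resolve()
    candidate = Path(file_name).resolve()
    # resolve() collapses ".." and follows symlinks, so a path that escapes
    # the allowed root is rejected here instead of being opened.
    if root not in candidate.parents:
        return None
    if not candidate.is_file():
        return None
    digest = hashlib.sha256()
    with candidate.open("rb") as handle:
        # Read in chunks so large documents are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Returning None for rejected paths lets the logger record the operation without touching the file. A caller could equally choose to raise an error.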

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The code dynamically modifies sys.path to load a Python module from a sibling skill directory based on a relative filesystem path. This creates a local code-loading trust boundary violation: if that external directory or its contents are replaced, tampered with, or unexpectedly present, this skill will execute unverified code during normal operation. In a 'local private data processing' skill, that is more concerning because imported code may gain access to sensitive local documents the skill is expected to process offline.
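
As an illustration of a narrower alternative to editing sys.path, the sketch below loads one module from an explicit file path and executes it only if its bytes match a SHA-256 digest pinned by the caller. The function load_verified_module and its parameters are hypothetical. How the expected digest would be distributed and kept trustworthy is outside this sketch.

```python
import hashlib
from pathlib import Path
from types import ModuleType


def load_verified_module(module_name: str, module_path: str, expected_sha256: str) -> ModuleType:
    """Execute a single source file as a module only if its hash matches."""
    path = Path(module_path)
    source = path.read_bytes()
    actual = hashlib.sha256(source).hexdigest()
    if actual != expected_sha256:
        raise ImportError(f"hash mismatch for {path}: refusing to load")
    module = ModuleType(module_name)
    module.__file__ = str(path)
    # Compile the exact bytes that were hashed, so the file cannot be swapped
    # between verification and execution.
    exec(compile(source, str(path), "exec"), module.__dict__)
    return module
```

This covers a single-file helper only. A sibling package with its own imports would need each file verified, or would be better installed as a pinned dependency.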

Intent-Code Divergence

High
Confidence
99% confidence
Finding
The code presents itself as a secure, isolated sandbox that prevents data leakage, but it only creates temporary directories and then executes an arbitrary processor function in the current Python process. No OS-level filesystem isolation, network blocking, privilege reduction, or resource controls are enforced, so a processor can still read arbitrary local files, access the network, or exfiltrate data despite the security claims.
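
For contrast with in-process execution, the sketch below runs code in a separate interpreter with a scratch working directory, a filtered environment, and a timeout. This is not a sandbox: the child can still read any file the user can read and can still open network connections. Enforcing the isolation the skill claims would need OS-level mechanisms such as containers or namespaces. The function name and the environment allowlist are assumptions made for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical allowlist: only variables the interpreter may need to start.
_ENV_ALLOWLIST = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


def run_in_child_process(code: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run Python source in a separate interpreter process."""
    env = {k: os.environ[k] for k in _ENV_ALLOWLIST if k in os.environ}
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            # -I starts Python in isolated mode (ignores PYTHON* variables
            # and the user site directory).
            [sys.executable, "-I", "-c", code],
            cwd=scratch,          # relative paths land in a throwaway directory
            env=env,              # parent secrets in the environment are not inherited
            capture_output=True,
            text=True,
            timeout=timeout_s,    # raises subprocess.TimeoutExpired on overrun
        )
```

Process separation mainly limits accidental state sharing and runaway execution time. It does not address the filesystem and network exposure described in the finding.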

Intent-Code Divergence

Medium
Confidence
98% confidence
Finding
The configuration exposes security-relevant options like isolate_filesystem, restrict_network, and max_memory_mb, but none of them are applied anywhere in execution. This creates a dangerous false sense of protection: callers may enable these flags and assume isolation exists when the processor still runs without any such restrictions.
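
One way to avoid the false sense of protection is to fail closed: refuse to run when a caller requests a restriction that nothing enforces. The sketch below is an assumption-laden illustration. The class name SandboxConfig, its defaults, and both helper functions are hypothetical. Only the option names isolate_filesystem, restrict_network, and max_memory_mb come from the finding.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class SandboxConfig:
    # Option names follow the finding; the defaults are assumptions.
    isolate_filesystem: bool = False
    restrict_network: bool = False
    max_memory_mb: int = 0  # 0 means "no limit requested" in this sketch


# Options this runtime actually enforces. Empty here, matching the finding.
ENFORCED_OPTIONS: FrozenSet[str] = frozenset()


def unenforced_options(config: SandboxConfig) -> List[str]:
    """List requested protections that the runtime does not enforce."""
    requested = []
    if config.isolate_filesystem:
        requested.append("isolate_filesystem")
    if config.restrict_network:
        requested.append("restrict_network")
    if config.max_memory_mb > 0:
        requested.append("max_memory_mb")
    return [name for name in requested if name not in ENFORCED_OPTIONS]


def require_enforceable(config: SandboxConfig) -> None:
    """Raise instead of silently ignoring a requested protection."""
    missing = unenforced_options(config)
    if missing:
        raise NotImplementedError("not enforced: " + ", ".join(missing))
```

Calling require_enforceable before running a processor turns an ignored flag into a visible error, so a caller cannot assume isolation that is not there.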

Missing User Warnings

Medium
Confidence
87% confidence
Finding
The handler writes temporary files derived from sensitive user documents to predictable paths beside the source file and only deletes them after processing. If processing crashes, deletion fails, or the directory is accessible to other local users/processes, confidential document contents may remain on disk and be exposed unexpectedly.
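
A minimal sketch of a safer pattern, assuming the handler only needs a scratch file for the duration of one processing step: create the file under a randomly named, owner-only directory and remove it in a finally block so cleanup also runs when processing raises. The context manager name private_temp_file and the directory prefix are hypothetical.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def private_temp_file(suffix: str = "") -> Iterator[str]:
    """Yield a temp file path inside a private directory, then delete both."""
    # mkdtemp creates a randomly named directory that is readable, writable,
    # and searchable only by the creating user.
    temp_dir = tempfile.mkdtemp(prefix="localdata-")
    fd, path = tempfile.mkstemp(suffix=suffix, dir=temp_dir)
    os.close(fd)
    try:
        yield path
    finally:
        # Runs on normal exit and on exceptions. It cannot run if the process
        # is killed outright, so residue is still possible in that case.
        shutil.rmtree(temp_dir, ignore_errors=True)
```

Usage: `with private_temp_file(".txt") as path:` followed by the processing step. The file no longer sits beside the source document, and its name is not predictable.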

Missing User Warnings

Medium
Confidence
83% confidence
Finding
Checkpoint files persist source file metadata, including file paths and hashes, to disk without an explicit user-facing control or disclosure. In a privacy-focused offline-processing skill, this creates a local data residue risk because file locations and processing history may be visible to other local users or forensic tools.
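
The residue risk could be reduced by making checkpointing opt-in and by creating the file with owner-only permissions. The sketch below illustrates both ideas. The function write_checkpoint and its enabled flag are hypothetical and are not taken from the skill's code.

```python
import json
import os
from typing import Any, Dict


def write_checkpoint(path: str, state: Dict[str, Any], enabled: bool = False) -> bool:
    """Persist state only when the caller has opted in. Returns True if written."""
    if not enabled:
        return False
    # Request mode 0o600 at creation so the file is never briefly
    # readable by group or other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    # The creation mode is ignored when the file already existed, so tighten
    # permissions explicitly as well.
    os.chmod(path, 0o600)
    return True
```

Permissions do not hide the file from the same user or from forensic tools. Omitting full source paths from the stored state would reduce what is exposed in those cases.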

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Core dependencies
torch>=2.0.0
transformers>=4.35.0
sentence-transformers>=2.2.2
Confidence
92% confidence
Finding
torch>=2.0.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Core dependencies
torch>=2.0.0
transformers>=4.35.0
sentence-transformers>=2.2.2

# Document parsing
Confidence
92% confidence
Finding
transformers>=4.35.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Core dependencies
torch>=2.0.0
transformers>=4.35.0
sentence-transformers>=2.2.2

# Document parsing
unstructured[all-docs]>=0.11.0
Confidence
90% confidence
Finding
sentence-transformers>=2.2.2

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Document parsing
unstructured[all-docs]>=0.11.0
pymupdf>=1.23.0
pdfplumber>=0.10.0
python-docx>=0.8.11
openpyxl>=3.1.0
Confidence
91% confidence
Finding
pymupdf>=1.23.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Document parsing
unstructured[all-docs]>=0.11.0
pymupdf>=1.23.0
pdfplumber>=0.10.0
python-docx>=0.8.11
openpyxl>=3.1.0
pandas>=2.0.0
Confidence
91% confidence
Finding
pdfplumber>=0.10.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
unstructured[all-docs]>=0.11.0
pymupdf>=1.23.0
pdfplumber>=0.10.0
python-docx>=0.8.11
openpyxl>=3.1.0
pandas>=2.0.0
Confidence
91% confidence
Finding
python-docx>=0.8.11

Unpinned Dependencies

Low
Category
Supply Chain
Content
pymupdf>=1.23.0
pdfplumber>=0.10.0
python-docx>=0.8.11
openpyxl>=3.1.0
pandas>=2.0.0

# OCR
Confidence
91% confidence
Finding
openpyxl>=3.1.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
pdfplumber>=0.10.0
python-docx>=0.8.11
openpyxl>=3.1.0
pandas>=2.0.0

# OCR
paddlepaddle-gpu>=2.5.0; sys_platform != "darwin"
Confidence
88% confidence
Finding
pandas>=2.0.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# OCR
paddlepaddle-gpu>=2.5.0; sys_platform != "darwin"
paddlepaddle>=2.5.0; sys_platform == "darwin"
paddleocr>=2.7.0
easyocr>=1.7.0

# Vector database
Confidence
89% confidence
Finding
paddleocr>=2.7.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
paddlepaddle-gpu>=2.5.0; sys_platform != "darwin"
paddlepaddle>=2.5.0; sys_platform == "darwin"
paddleocr>=2.7.0
easyocr>=1.7.0

# Vector database
chromadb>=0.4.0
Confidence
88% confidence
Finding
easyocr>=1.7.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
easyocr>=1.7.0

# Vector database
chromadb>=0.4.0
faiss-cpu>=1.7.4

# Text processing
Confidence
87% confidence
Finding
chromadb>=0.4.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Vector database
chromadb>=0.4.0
faiss-cpu>=1.7.4

# Text processing
langchain>=0.1.0
Confidence
87% confidence
Finding
faiss-cpu>=1.7.4

Unpinned Dependencies

Low
Category
Supply Chain
Content
faiss-cpu>=1.7.4

# Text processing
langchain>=0.1.0
langchain-community>=0.0.10
jinja2>=3.1.0
pyyaml>=6.0.1
Confidence
93% confidence
Finding
langchain>=0.1.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Text processing
langchain>=0.1.0
langchain-community>=0.0.10
jinja2>=3.1.0
pyyaml>=6.0.1
Confidence
93% confidence
Finding
langchain-community>=0.0.10

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Text processing
langchain>=0.1.0
langchain-community>=0.0.10
jinja2>=3.1.0
pyyaml>=6.0.1

# Encoding detection
Confidence
90% confidence
Finding
jinja2>=3.1.0

Unpinned Dependencies

Low
Category
Supply Chain
Content
langchain>=0.1.0
langchain-community>=0.0.10
jinja2>=3.1.0
pyyaml>=6.0.1

# Encoding detection
chardet>=5.2.0
Confidence
91% confidence
Finding
pyyaml>=6.0.1
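
The unpinned-dependency findings above all follow one pattern: a `>=` lower bound instead of an exact `==` version. A reviewer could reproduce the check locally with a small script like the sketch below. The function name unpinned_requirements is hypothetical, and the parser is deliberately simple (it does not handle URLs, includes, or line continuations).

```python
import re
from typing import List

# A line counts as pinned when the version specifier right after the package
# name (and optional extras) is an exact "==" match.
_PINNED = re.compile(r"^[A-Za-z0-9._\[\],-]+\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that do not pin an exact version."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        if not _PINNED.match(line):
            flagged.append(line)
    return flagged


sample = """# Core dependencies
torch>=2.0.0
pyyaml==6.0.1
"""
print(unpinned_requirements(sample))  # ['torch>=2.0.0']
```

Exact pins make installs reproducible. pip's hash-checking mode (`--require-hashes`) goes further by also verifying the downloaded artifacts.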

VirusTotal

66/66 vendors flagged this skill as clean.
