Security audit

Ingest Document

Security checks across malware telemetry and agentic risk

Overview

This skill appears intended to save documents into a Gitea-backed knowledge base, but it uses broad repository credentials and can write or archive user content with limited scoping controls.

Install only if you are comfortable giving this skill write access to the intended Gitea knowledge-base repositories. Use a dedicated low-privilege token limited to those repos, avoid a site-admin token, pin and audit dependencies, and make sure users know when full source files will be archived and who can access them.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (22)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill invokes multiple local scripts, reads uploaded content, writes temporary and persisted files, and returns network-hosted Gitea URLs, but it declares no explicit permissions or capability boundaries. That creates a real security governance gap: reviewers and runtime policy engines cannot clearly constrain file, environment, or network access, increasing the chance of over-privileged execution or misuse of sensitive workspace data.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: This helper wraps site-admin-capable Gitea operations, including checking admin access and creating repositories for arbitrary users via /admin/users/{username}/repos. In a document-ingestion skill, that is excessive privilege and expands blast radius: compromise or misuse of this module could create or alter resources across other users’ namespaces, not just a constrained bot-owned area.
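
One way to narrow the blast radius described in this finding is a client-side guard that refuses admin routes before any request is sent. The following is a minimal sketch, not the skill's actual code: the function name `require_non_admin_path` and the `ADMIN_PREFIX` constant are hypothetical, and the prefix simply mirrors the `/admin/users/{username}/repos` route named above.

```python
import posixpath

# Hypothetical guard: the prefix mirrors the /admin/users/{username}/repos
# route named in the finding; all names here are assumptions for illustration.
ADMIN_PREFIX = "/admin"


def require_non_admin_path(api_path: str) -> str:
    """Return a normalized API path, refusing anything under the admin prefix."""
    # Normalize first so "repos/../admin/..." cannot slip past the prefix check.
    normalized = posixpath.normpath("/" + api_path.lstrip("/"))
    if normalized == ADMIN_PREFIX or normalized.startswith(ADMIN_PREFIX + "/"):
        raise PermissionError(f"admin endpoint blocked: {normalized}")
    return normalized
```

A guard like this is defense in depth only. The stronger control is a token that has no site-admin rights in the first place, so the server rejects such calls regardless of what the client does.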

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The script directly writes attacker-controlled content to a remote repository path and updates catalog metadata, which is broader than a narrowly scoped single-document ingest flow. Although it performs a simple name sanitization and basic replacement by exact name, it lacks stronger workflow controls such as repository scoping, content validation, authorization checks, and robust duplicate/aggregation handling, so it can be used to create or overwrite arbitrary knowledge-base pages within the target repo.
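
The "simple name sanitization" mentioned in this finding could be tightened by validating page names against a strict pattern instead of stripping characters. The sketch below assumes a hypothetical policy (a single path segment of lowercase letters, digits, dots, hyphens, and underscores); the function name and the exact character set are illustrative, not taken from the skill.

```python
import re
import unicodedata

# Hypothetical policy: one path segment, starting with a letter or digit,
# at most 100 characters, drawn from a small safe character set.
_SAFE_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{0,99}")


def safe_page_name(raw: str) -> str:
    """Return a normalized single-segment page name or raise ValueError."""
    # NFKC folds look-alike characters (for example fullwidth dots) before checks.
    name = unicodedata.normalize("NFKC", raw).strip().lower()
    name = re.sub(r"\s+", "-", name)
    # Reject traversal sequences and anything outside the allowed pattern.
    if ".." in name or not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"rejected page name: {raw!r}")
    return name
```

Rejecting unexpected input, rather than rewriting it, avoids two different raw names silently collapsing into the same page and overwriting each other.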

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: Accepting owner and repo as unrestricted command-line inputs allows the skill to target arbitrary remote repositories reachable by the configured credentials, not just the intended personal or team knowledge base. In an agent context, this materially increases the blast radius: if the agent is induced to invoke the skill with attacker-chosen arguments, it could modify unrelated repositories, causing unauthorized content creation, overwrite, or defacement.
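
A common mitigation for unrestricted target arguments is an explicit allowlist checked at parse time. The sketch below is an assumption-heavy illustration: the `--owner` and `--repo` flag names, the `ALLOWED_TARGETS` set, and the example repository names are all hypothetical and would need to match the real deployment.

```python
import argparse

# Hypothetical allowlist; in practice this would come from deployment config,
# not from anything the agent or the user can influence at call time.
ALLOWED_TARGETS = frozenset({("kb-bot", "personal-kb"), ("kb-bot", "team-kb")})


def parse_target(argv: list[str]) -> tuple[str, str]:
    """Parse --owner/--repo and exit with an error unless the pair is allowlisted."""
    parser = argparse.ArgumentParser(description="Ingest one document")
    parser.add_argument("--owner", required=True)
    parser.add_argument("--repo", required=True)
    args = parser.parse_args(argv)
    target = (args.owner, args.repo)
    if target not in ALLOWED_TARGETS:
        # parser.error prints the message to stderr and raises SystemExit(2).
        parser.error(f"target {args.owner}/{args.repo} is not an allowed knowledge base")
    return target
```

With a check like this, an agent induced to pass attacker-chosen arguments fails before any network call is made, even if the configured credentials could reach the other repository.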

Description-Behavior Mismatch

High

Confidence: 94%
Finding: This module manages a broad control-plane repository and central state for users, teams, chat bindings, permissions, jobs, and sources, which is far beyond the stated scope of ingesting a single document. In a skill that should handle document ingestion, hidden access to global control-plane state increases the blast radius of compromise and enables unauthorized persistence or cross-tenant manipulation if this code is reachable.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The initialization routine provisions and maintains files for chat bindings, pending bindings, events, users, teams, and permissions even though those capabilities are not justified by the skill's declared purpose. This kind of overbroad state management can be abused to alter identity or access-control metadata and is especially dangerous because it silently creates and updates a shared system repository.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill explicitly archives the original source file into the knowledge base via `--source_file_path`, but the user-facing reply requirements never require clear notice, consent, or retention disclosure. This is dangerous because users may submit sensitive documents expecting summarization only, while the full original is retained and potentially exposed to broader personal/team access, creating privacy, confidentiality, and data-retention risk.
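
The missing notice and consent step could be modeled as an explicit decision made before the original file is archived. The sketch below is illustrative only: `ArchiveDecision`, `decide_archive`, the `user_consented` flag, and the notice wording are hypothetical, and only the `source_file_path` name echoes the `--source_file_path` option cited in the finding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArchiveDecision:
    archive_path: Optional[str]  # None means the original is not retained
    notice: str                  # text to include in the user-facing reply


def decide_archive(source_file_path: Optional[str], user_consented: bool,
                   audience: str) -> ArchiveDecision:
    """Archive the original only when a path is given AND consent was recorded."""
    if not source_file_path:
        return ArchiveDecision(None, "No original file was archived.")
    if not user_consented:
        return ArchiveDecision(
            None,
            "The original file was NOT archived because consent was not given; "
            "only the ingested page was saved.",
        )
    return ArchiveDecision(
        source_file_path,
        f"The full original file was archived and is readable by: {audience}.",
    )
```

Returning the notice alongside the decision makes it harder for the reply to omit the retention disclosure, since the caller receives both in one value.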

Env Variable Harvesting

High

Category: Data Exfiltration
Content:
    _load_env()
    GITEA_URL = os.environ.get("GITEA_URL", "").rstrip("/")
    ADMIN_TOKEN = os.environ.get("GITEA_ADMIN_TOKEN", "")
    BOT_USERNAME = os.environ.get("GITEA_BOT_USERNAME", "AIFusionBot")
Confidence: 70%
Finding: os.environ.get("GITEA_ADMIN_TOKEN

Credential Access

High

Category: Privilege Escalation
Content:
    except ImportError:
        return
    here = Path(__file__).resolve().parent
    for candidate in (here / ".env", here.parent / ".env"):
        if candidate.exists():
            load_dotenv(candidate)
            return
Confidence: 60%
Finding: .env"

Credential Access

High

Category: Privilege Escalation
Content:
    except ImportError:
        return
    here = Path(__file__).resolve().parent
    for candidate in (here / ".env", here.parent / ".env"):
        if candidate.exists():
            load_dotenv(candidate)
            return
Confidence: 60%
Finding: .env"

Credential Access

High

Category: Privilege Escalation
Content:
    #!/usr/bin/env bash
    set -e
    python3 -m pip install -r requirements.txt
    if [ ! -f .env ]; then cp env-example.txt .env; fi
    echo "setup complete"
Confidence: 60%
Finding: .env
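
Because the `.env` files flagged above hold the Gitea token, one hedged hardening step is to check file type and permissions before loading them. The sketch below assumes a POSIX system; the helper name `env_file_is_private` is hypothetical, and the check would sit in front of whatever loader the skill uses.

```python
import os
import stat
from pathlib import Path


def env_file_is_private(path: Path) -> bool:
    """True if path is a regular, non-symlink file with no group/other access (POSIX)."""
    try:
        # lstat does not follow symlinks, so a symlinked .env is rejected below.
        info = os.lstat(path)
    except FileNotFoundError:
        return False
    if not stat.S_ISREG(info.st_mode):
        return False
    # Any group or other permission bit means the file is not private.
    return (info.st_mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A loader could skip any candidate for which this returns False and report the reason, rather than silently reading credentials from a world-readable or symlinked file.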

Unpinned Dependencies

Low

Category: Supply Chain
Content: requests python-dotenv pymupdf python-docx
Confidence: 97%
Finding: requests

Unpinned Dependencies

Low

Category: Supply Chain
Content: requests python-dotenv pymupdf python-docx openpyxl
Confidence: 97%
Finding: python-dotenv

Unpinned Dependencies

Low

Category: Supply Chain
Content: requests python-dotenv pymupdf python-docx openpyxl xlrd
Confidence: 97%
Finding: pymupdf

Unpinned Dependencies

Low

Category: Supply Chain
Content: requests python-dotenv pymupdf python-docx openpyxl xlrd
Confidence: 99%
Finding: python-docx

Unpinned Dependencies

Low

Category: Supply Chain
Content: python-dotenv pymupdf python-docx openpyxl xlrd
Confidence: 99%
Finding: openpyxl

Unpinned Dependencies

Low

Category: Supply Chain
Content: pymupdf python-docx openpyxl xlrd
Confidence: 94%
Finding: xlrd
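
The unpinned-dependency findings above can be reproduced locally with a small check over the requirements file. This is a minimal sketch, assuming a plain `requirements.txt` with one requirement per line; the function name is hypothetical, and the pattern only recognizes `name==version` pins (it does not verify hashes or reject wildcard versions).

```python
import re

# Matches "name==version" with optional extras; anything else counts as unpinned.
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that lack an exact '==' version pin."""
    findings = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry or entry.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(entry):
            findings.append(entry)
    return findings


# The six packages flagged in this audit, as they would appear unpinned.
flagged = "requests\npython-dotenv\npymupdf\npython-docx\nopenpyxl\nxlrd\n"
print(unpinned_requirements(flagged))  # all six names are reported
```

Run in CI, a check like this fails the build as soon as an unpinned entry is added, which complements the one-time audit result.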

Known Vulnerable Dependency: requests (10 advisories): CVE-2014-1830 (Exposure of Sensitive Information to an Unauthorized Actor in Requests); CVE-2024-47081 (Requests vulnerable to .netrc credentials leak via malicious URLs); CVE-2024-35195 (Requests `Session` object does not verify requests after making first request wi) +7 more

High

Category: Supply Chain
Confidence: 95%
Finding: requests

Known Vulnerable Dependency: python-dotenv (1 advisory): CVE-2026-28684 (python-dotenv: Symlink following in set_key allows arbitrary file overwrite via )

Low

Category: Supply Chain
Confidence: 84%
Finding: python-dotenv

Known Vulnerable Dependency: pymupdf (1 advisory): CVE-2026-3029 (PyMuPDF has a path traversal in _main_.py)

Low

Category: Supply Chain
Confidence: 76%
Finding: pymupdf

Known Vulnerable Dependency: python-docx (2 advisories): CVE-2016-5851 (Improper Restriction of XML External Entity Reference in python-docx); CVE-2016-5851 (python-docx before 0.8.6 allows context-dependent attackers to conduct XML Exter)

High

Category: Supply Chain
Confidence: 98%
Finding: python-docx

Known Vulnerable Dependency: openpyxl (2 advisories): CVE-2017-5992 (Improper Restriction of XML External Entity Reference in Openpyxl); CVE-2017-5992 (Openpyxl 2.4.1 resolves external entities by default, which allows remote attack)

High

Category: Supply Chain
Confidence: 98%
Finding: openpyxl

VirusTotal

64/64 vendors flagged this skill as clean.
