Eval Repo

Security checks across malware telemetry and agentic risk

Overview

The skill appears purpose-built for evaluating GitHub repositories, but it needs broad Gitea administrator access, writes persistent knowledge-base and Feishu records, and ships risky default configuration for that authority.

Install only if you control the configured Gitea server and are comfortable giving this skill a site-admin token. Prefer an HTTPS Gitea URL, a least-privilege token if the system supports it, pinned dependencies, and an explicit operating rule that storage and Feishu sync happen only after user confirmation.
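The HTTPS preference above can be enforced before any token is sent. The following is a minimal sketch, not the skill's own code: the function name `is_https_url` is hypothetical, and it assumes the Gitea base URL is available as a plain string.

```python
from urllib.parse import urlparse


def is_https_url(url: str) -> bool:
    """Return True only for absolute https:// URLs that include a host.

    Hypothetical helper: the skill's real configuration loading is not shown
    in this report, so the caller is assumed to pass the Gitea base URL.
    """
    parsed = urlparse(url.strip())
    return parsed.scheme == "https" and bool(parsed.netloc)
```

A caller could refuse to start when `is_https_url(gitea_url)` is false, so that the site-admin token is never sent over plain HTTP.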

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Intent-Code Divergence

Medium

Confidence: 93%
Finding: This script is described and named as a read/query tool, but the --list path can also perform a write via kb.append_query_log(username, args.log_question). That hidden side effect breaks least astonishment and can cause unintended persistence of user-provided data, which is especially risky in agent workflows that may treat read-only tools as safe to call more freely.
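One way to remove the hidden side effect is to make the write path opt-in. The sketch below is illustrative only: the `--list` flag and the `log_question` argument come from the finding, while the `--log-question` spelling, the `--confirm-write` flag, and the helper names are assumptions, not the script's actual interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI where listing stays read-only by default."""
    parser = argparse.ArgumentParser(description="Knowledge-base query tool (sketch)")
    parser.add_argument("--list", action="store_true", help="List entries (read-only)")
    parser.add_argument("--log-question", default=None, help="Question text to persist")
    # Assumed flag: writing requires an explicit, separate opt-in.
    parser.add_argument("--confirm-write", action="store_true",
                        help="Allow the question to be written to the log")
    return parser


def should_log(args: argparse.Namespace) -> bool:
    """Write only when a question was given AND the write was confirmed."""
    return bool(args.log_question) and args.confirm_write
```

With this shape, the caller would invoke `kb.append_query_log(...)` only when `should_log(args)` is true, so a plain `--list` call cannot persist anything.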

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly persists generated evaluations into the user's Gitea knowledge base and may also sync metadata into a Feishu table, but it does not require a clear, informed user confirmation about these side effects at the point of execution. This can cause unintended retention and broader sharing of repository links, assessments, and user-context-derived analysis, especially when the user only asked for an evaluation rather than durable storage or external sync.
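A confirmation gate at the point of execution could look like the sketch below. It assumes the skill can prompt the user synchronously; the function name `confirm_side_effects` and the prompt wording are hypothetical.

```python
from typing import Callable, Iterable


def confirm_side_effects(targets: Iterable[str],
                         ask: Callable[[str], str] = input) -> bool:
    """Ask before persisting; anything other than y/yes counts as a refusal.

    `targets` names the destinations (for example the Gitea knowledge base
    and the Feishu table). `ask` is injectable so the gate can be tested.
    """
    names = ", ".join(targets)
    answer = ask(f"Store this evaluation in: {names}? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" means an empty or unexpected answer never triggers storage or sync.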

Vague Triggers

Medium

Confidence: 84%
Finding: The manifest describes a high-privilege workflow that can fetch external repositories, evaluate them against user data, and write results into a knowledge base and Feishu table, but it does not define clear activation boundaries or user-consent constraints. In this context, broad scope increases the chance of over-collection, unintended invocation, or misuse of the required admin token and user context during repository ingestion and evaluation.

Natural-Language Policy Violations

Low

Confidence: 76%
Finding: The manifest description is Chinese-only and does not indicate that language behavior is selected based on user preference or locale. This is primarily a safety/UX issue, but in a skill that performs repository evaluation and persistence, forced language can cause users to misunderstand what data will be collected, how conclusions are derived, or when results will be written to external systems.

Missing User Warnings

Medium

Confidence: 93%
Finding: The function records user questions to a persistent repository log automatically, and the comment indicates this is intended to happen for every query. User questions can contain sensitive data, internal research topics, credentials pasted by mistake, or personal information, so silently persisting them creates a privacy and data-retention risk that can expose users if the repo is shared, leaked, or broadly accessible.

Missing User Warnings

Medium

Confidence: 91%
Finding: User-controlled query text is written to log.md without any warning, consent flow, or minimization visible in this file. This creates a privacy and data-governance issue because users may include sensitive information in natural-language queries, and a seemingly harmless listing operation can permanently store that content.
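If query logging is kept at all, minimization before the write reduces the exposure. The following is a rough sketch under stated assumptions: the patterns are illustrative heuristics rather than a complete secret detector, and `redact` is a hypothetical helper, not part of the skill.

```python
import re

# Illustrative heuristics only; real secrets come in many more shapes.
_SECRET_PATTERNS = [
    # key=value or key: value pairs with a sensitive-looking key
    re.compile(r"(?i)\b(token|password|secret|api[_-]?key)\s*[:=]\s*\S+"),
    # long unbroken runs that resemble tokens or hashes
    re.compile(r"\b[A-Za-z0-9_\-]{32,}\b"),
]


def redact(text: str, max_len: int = 200) -> str:
    """Mask secret-looking substrings, then truncate what gets stored."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text[:max_len]
```

Redaction of this kind lowers risk but does not replace consent: the user should still be told that the query will be written to log.md.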

Ssd 3

Medium

Confidence: 94%
Finding: The inline instruction explicitly states that query logging should always occur, reinforcing a design choice of mandatory persistence of user queries. In a knowledge-base context, queries are likely to reflect confidential research interests or sensitive text snippets, so mandatory logging increases privacy risk and can enable unnecessary data accumulation and later disclosure.

Ssd 3

Medium

Confidence: 89%
Finding: The script explicitly creates a data-collection path for free-form user input by recording log_question during listing. In the context of a knowledge-base read tool, this is more dangerous because operators may assume the command is read-only and may pass sensitive research questions or identifiers that then become stored content in the repository.

Unpinned Dependencies

Low

Category: Supply Chain
Content: requests>=2.28 python-dotenv>=1.0
Confidence: 95%
Finding: requests>=2.28

Unpinned Dependencies

Low

Category: Supply Chain
Content: requests>=2.28 python-dotenv>=1.0
Confidence: 93%
Finding: python-dotenv>=1.0
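A pre-install check can flag range specifiers such as the two above. This is a simplified sketch: it treats only an exact `==` version as pinned, ignores hash pinning and other requirement-file features, and the function name `unpinned` is hypothetical.

```python
import re

# A line counts as pinned only if it is "name[extras]==version" with an
# optional environment marker; wildcards such as "==2.*" are rejected.
_PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def unpinned(requirements: str) -> list[str]:
    """Return requirement lines that are not pinned to one exact version."""
    flagged = []
    for line in requirements.splitlines():
        spec = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if spec and not _PINNED.match(spec):
            flagged.append(spec)
    return flagged
```

Run against the two specifiers reported here, the function returns both lines, which matches the scanner's result.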

Known Vulnerable Dependency: requests (10 advisories): CVE-2014-1830 (Exposure of Sensitive Information to an Unauthorized Actor in Requests); CVE-2024-47081 (Requests vulnerable to .netrc credentials leak via malicious URLs); CVE-2024-35195 (Requests `Session` object does not verify requests after making first request wi) +7 more

High

Category: Supply Chain
Confidence: 89%
Finding: requests

Known Vulnerable Dependency: python-dotenv (1 advisory): CVE-2026-28684 (python-dotenv: Symlink following in set_key allows arbitrary file overwrite via )

Low

Category: Supply Chain
Confidence: 74%
Finding: python-dotenv

VirusTotal

61/61 vendors flagged this skill as clean.
