'AI自动化测试平台' (AI Automated Testing Platform)

Security checks across malware telemetry and agentic risk

Overview

The skill's behavior is consistent with an internal AI testing platform, but it grants broad, high-impact authority with weak scoping and unclear data-boundary protections.

Review before installing or deploying. Use the skill only in a tightly isolated internal environment, and replace all default credentials and secrets. Require a separate admin role for auth, system, and backup functions; disable query-string tokens; and add owner checks on task, script, report, and artifact access. Redact data before sending it to DeepSeek, or use a local model instead. Run generated Python/Playwright tests only in disposable sandboxes with minimal network and filesystem access.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (50)

Lp3

Medium
Category
MCP Least Privilege
Confidence
95% confidence
Finding
The skill describes capabilities to read/write files, access the network, and execute shell-driven workflows, yet it does not declare permissions. That creates a transparency and governance gap: operators may enable or trust the skill without understanding its effective access, which is especially risky because it can generate and run test scripts, invoke external APIs, and manipulate persistent data.

Tp4

High
Category
MCP Tool Poisoning
Confidence
82% confidence
Finding
The documented scope understates the operational and administrative behaviors exposed by the skill, including broader model management, backup operations, scheduling/executor functions, and direct auth-code administration. This mismatch can cause reviewers and deployers to underestimate the attack surface and inadvertently grant a skill far more authority than its description suggests.

Intent-Code Divergence

Medium
Confidence
94% confidence
Finding
The guide states that all data must remain internal and that no external uploads are allowed, but it also relies on the external DeepSeek API. This contradiction can lead users to send internal documents, logs, or test data off-premises under a false assumption of data locality, creating confidentiality and compliance risk.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The deployment guide adds API gateway Lua code that performs authorization by directly querying the database, which expands the trusted computing base and introduces a second, inconsistent auth implementation outside the application. In a testing platform, embedding ad hoc auth logic in the gateway is dangerous because the sample concatenates the Authorization header into SQL, creating a likely SQL injection path at the gateway layer.
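
To illustrate the difference between concatenating the header into SQL and binding it as a parameter, here is a minimal Python sketch. The original sample is Lua at the gateway; this is only an analogue, and the table name `auth_codes`, its columns, and the function `lookup_auth_code` are assumptions made for the example, not taken from the skill.

```python
import sqlite3
from typing import Optional, Tuple


def lookup_auth_code(conn: sqlite3.Connection, header_value: str) -> Optional[Tuple]:
    """Look up an auth code without building SQL text from caller input."""
    # The header value is bound as a parameter, so quotes or SQL keywords
    # inside it are treated as data, never as part of the statement.
    cur = conn.execute(
        "SELECT code, is_active FROM auth_codes WHERE code = ?",
        (header_value,),
    )
    return cur.fetchone()


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table (hypothetical schema) for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE auth_codes (code TEXT PRIMARY KEY, is_active INTEGER)")
    conn.execute("INSERT INTO auth_codes VALUES (?, ?)", ("abc123", 1))
    conn.commit()
    return conn
```

With a bound parameter, a classic injection string such as `' OR '1'='1` is compared literally against the `code` column and matches no row.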

Intent-Code Divergence

High
Confidence
95% confidence
Finding
The document claims the database/gateway approach is effectively unbypassable, but the provided gateway sample only checks is_active and omits expiry, permission scope, and usage-count enforcement. This mismatch creates a false sense of security and can allow unauthorized use when operators rely on the incomplete example as a compensating control.
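
A check that covers the omitted conditions might look like the following sketch. Only `is_active` and the `'all'` permission value appear in the report; the other field names (`expires_at`, `permissions`, `max_uses`, `used_count`) and the function `is_authorized` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AuthCode:
    is_active: bool
    expires_at: Optional[datetime]  # None means the code never expires
    permissions: FrozenSet[str]     # e.g. {"execute"} or {"all"}
    max_uses: Optional[int]         # None means unlimited uses
    used_count: int


def is_authorized(code: AuthCode, required: str,
                  now: Optional[datetime] = None) -> bool:
    """Return True only if every condition holds, not just is_active."""
    now = now or datetime.now(timezone.utc)
    if not code.is_active:
        return False
    if code.expires_at is not None and now >= code.expires_at:
        return False  # expired
    if required not in code.permissions and "all" not in code.permissions:
        return False  # permission scope does not cover this action
    if code.max_uses is not None and code.used_count >= code.max_uses:
        return False  # usage quota exhausted
    return True
```

Whatever layer performs the check, keeping it in one place avoids the gateway and the application drifting apart.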

Description-Behavior Mismatch

High
Confidence
98% confidence
Finding
The admin API exposes authorization-code creation and verification with no visible authentication or authorization checks before performing privileged actions. Because the endpoint can mint usable auth codes for broad permissions such as 'all', an unauthenticated caller could grant themselves platform access and bypass the intended authorization model.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The progress endpoint returns task status by task_id with no authentication or authorization check, allowing anyone who can guess or obtain a task ID to view execution state and potentially sensitive test metadata. In an internal AI-powered testing platform, task progress and results may reveal scripts, environments, endpoints, or operational details, so the lack of access control is a real information disclosure issue.

Context-Inappropriate Capability

Medium
Confidence
98% confidence
Finding
The record detail endpoint exposes execution record data solely by record_id and performs no authentication or authorization checks. This creates an insecure direct object reference pattern where unauthorized users can retrieve potentially sensitive test execution details, including internal system behavior, test data, or environment-specific information.

Context-Inappropriate Capability

Medium
Confidence
98% confidence
Finding
The task progress endpoint returns task status and result data without any authorization check, while generation endpoints tie created content to an auth_code. Because completed task results may include generated test case content, script content, and file paths, an attacker who can guess or obtain a task_id can access another user's sensitive generated artifacts. In an internal AI testing platform handling company documents and automation scripts, this becomes more dangerous because task results may expose proprietary specs, test logic, or internal paths.
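
One way to close this gap is an owner check before any task data is returned. The sketch below assumes each task record stores the auth code that created it under a hypothetical `owner` key; the store layout and the names `get_task_progress` and `TaskAccessError` are illustrative only.

```python
import hmac
from typing import Dict


class TaskAccessError(Exception):
    """Raised when the task is missing or belongs to another caller."""


def get_task_progress(task_id: str, caller_auth_code: str,
                      tasks: Dict[str, dict]) -> dict:
    """Return status and result only to the auth code that created the task."""
    task = tasks.get(task_id)
    # Missing and foreign tasks produce the same error, so a caller cannot
    # probe which task IDs exist.
    if task is None or not hmac.compare_digest(
        task["owner"].encode(), caller_auth_code.encode()
    ):
        raise TaskAccessError("task not found")
    return {"status": task["status"], "result": task["result"]}
```

The same pattern would apply to record, script, report, and artifact lookups that are currently keyed by ID alone.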

Intent-Code Divergence

Medium
Confidence
93% confidence
Finding
The module-level description explicitly mentions authorization management for the platform, but the progress endpoint is implemented without any authorization enforcement. This inconsistency is security-relevant because it creates a false expectation that all related endpoints are protected, increasing the chance that sensitive generation results are exposed unnoticed. In this skill context, authorization gaps are especially risky because the service processes internal company testing materials.

Description-Behavior Mismatch

High
Confidence
97% confidence
Finding
Several script management endpoints (`get_script`, `update_script`, `delete_script`) do not require or verify any authorization, while adjacent routes rely only on a caller-supplied auth code. This enables unauthorized users to read, modify, or delete test scripts by ID, creating a direct access-control failure in a platform that claims authorization management for internal use.

Description-Behavior Mismatch

High
Confidence
96% confidence
Finding
Browser configuration and artifact retrieval routes (`configure_browser`, `get_browser_config`, `get_screenshot`, `get_trace_file`) expose sensitive operations and files without authorization checks. An attacker could alter execution settings or download screenshots/traces that may contain internal URLs, credentials, session data, or other sensitive test artifacts.

Description-Behavior Mismatch

High
Confidence
98% confidence
Finding
The middleware explicitly whitelists `/admin/add_auth` and `/admin/create_auth`, allowing unauthenticated access to endpoints that appear to manage authorization credentials. In an internal AI test platform, exposing auth-management routes publicly can let an attacker mint or alter authorization codes, leading to broad privilege escalation across test generation and execution capabilities.
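
A safer allow-list keeps admin routes out of the unauthenticated set and fails loudly if one is added. This is a sketch under assumptions: the public paths `/health` and `/docs`, the `/admin/` prefix convention, and both function names are hypothetical.

```python
from typing import FrozenSet

ADMIN_PREFIX = "/admin/"

# Hypothetical list of routes that may be served without credentials.
PUBLIC_PATHS: FrozenSet[str] = frozenset({"/health", "/docs"})


def validate_public_paths(paths: FrozenSet[str]) -> None:
    """Refuse to start if any admin route is marked as public."""
    exposed = sorted(p for p in paths if p.startswith(ADMIN_PREFIX))
    if exposed:
        raise ValueError(f"admin routes must not bypass auth: {exposed}")


def requires_auth(path: str, public_paths: FrozenSet[str] = PUBLIC_PATHS) -> bool:
    """Exact-match allow-list: anything not listed requires credentials."""
    return path not in public_paths
```

Calling `validate_public_paths` at startup turns a misconfigured whitelist into a deployment failure rather than a silent bypass.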

Intent-Code Divergence

Medium
Confidence
94% confidence
Finding
The comment and implementation both indicate that listed paths bypass authorization, and that list includes admin endpoints for authorization management. This is not merely a documentation issue; it reflects a real insecure design that lowers protection around the most sensitive control-plane functions in the application.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
This service takes stored script content, writes it to a temporary .py file, and executes it with pytest. Because pytest imports and runs Python code, any user able to create or modify scripts can achieve arbitrary code execution on the server, potentially leading to data theft, lateral movement, environment-secret access, or full host compromise. In an internal AI-powered testing platform, this is especially dangerous because the feature appears normalized as routine test execution.
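
If script execution is kept, running pytest inside a disposable container rather than on the host limits what the script can reach. The sketch below only builds a `docker run` argument list; the image name `test-runner:latest`, the mount point `/work`, and the resource limits are assumptions, and the flags shown reduce exposure without making execution of untrusted code safe by themselves.

```python
from typing import List


def build_sandbox_command(script_dir: str,
                          image: str = "test-runner:latest") -> List[str]:
    """Build a docker invocation that runs pytest with restricted access."""
    return [
        "docker", "run", "--rm",          # container is discarded after the run
        "--network", "none",              # no network access
        "--read-only",                    # root filesystem is read-only
        "--tmpfs", "/tmp",                # scratch space that vanishes on exit
        "--cap-drop", "ALL",              # drop all Linux capabilities
        "--pids-limit", "128",            # bound process count
        "--memory", "512m",               # bound memory
        "--user", "65534:65534",          # run as an unprivileged uid:gid
        "-v", f"{script_dir}:/work:ro",   # scripts mounted read-only
        "-w", "/work",
        image,
        # Cache plugin disabled because /work is not writable.
        "python", "-m", "pytest", "-q", "-p", "no:cacheprovider", "/work",
    ]
```

UI and API tests that must reach a target system cannot use `--network none`; in that case a dedicated network that allows only the intended test targets is the closer fit.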

Context-Inappropriate Capability

Medium
Confidence
89% confidence
Finding
The service returns execution record details including raw logs and, in the detail view, the stored authorization code, with no visible access-control checks or data minimization in this code path. In an internal AI testing platform, logs commonly contain request payloads, tokens, endpoints, and failure traces, so exposing them broadly can leak sensitive operational data and credentials.

Description-Behavior Mismatch

Medium
Confidence
89% confidence
Finding
The service sends raw execution logs to an external AI provider for failure analysis. Test logs commonly contain sensitive internal data such as stack traces, endpoints, tokens, usernames, file paths, or proprietary code context, so this creates a real confidentiality and data-boundary risk if users or administrators are not explicitly consenting to external sharing. In this internal company testing platform context, the risk is higher because the logs are likely to reflect non-public systems and internal test assets.

Context-Inappropriate Capability

High
Confidence
92% confidence
Finding
The service exposes database/file backup and restore primitives with no visible authorization checks, scope restrictions, path validation, or safety controls in this layer. In an AI testing platform, these are highly sensitive administrative operations; if reachable by an unintended caller, they could enable destructive restore actions, deletion of backups, or broader compromise of application data and files.

Description-Behavior Mismatch

Medium
Confidence
86% confidence
Finding
The document exposes broad administrative capabilities—AI model administration, environment management, audit logs, and backup/restore—that exceed the stated scope of an automated testing skill. Scope expansion like this increases attack surface and can enable misuse of privileged system-management functions if the skill is deployed or trusted based only on its manifest description.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
Backup and restore are powerful administrative operations unrelated to ordinary automated test execution and can affect the entire platform state. In this context, undocumented or weakly justified restore capability is dangerous because it could be abused to overwrite data, roll back security state, or access sensitive backups under the guise of a testing tool.

Missing User Warnings

Medium
Confidence
89% confidence
Finding
Capturing screenshots for every UI test can collect credentials, personal data, internal dashboards, tokens, and other sensitive visual content. Because the skill stores screenshots persistently, this expands the volume of sensitive artifacts and raises the risk of unauthorized access, leakage, or over-retention.

Missing User Warnings

High
Confidence
97% confidence
Finding
The failure-analysis prompt sends execution logs to an external AI service without warning or controls around sensitive content. Test logs often contain request headers, tokens, stack traces, PII, endpoints, and internal system details, so transmitting them externally can expose secrets and proprietary information.
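
A redaction pass before any log leaves the environment could look like the sketch below. The patterns are illustrative and far from exhaustive, and the function name `redact_log` is hypothetical; real logs would need patterns tuned to the tokens and identifiers actually in use.

```python
import re

# Each entry is (compiled pattern, replacement). Order matters: bearer
# tokens first, then key=value secrets, then e-mail addresses.
_REDACTIONS = [
    (re.compile(r"(?i)(authorization:\s*bearer\s+)[A-Za-z0-9._~+/=-]+"),
     r"\g<1>[REDACTED]"),
    (re.compile(r"(?i)\b(password|passwd|token|api[_-]?key|secret)\b(\s*[=:]\s*)\S+"),
     r"\g<1>\g<2>[REDACTED]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
     "[EMAIL]"),
]


def redact_log(text: str) -> str:
    """Mask common secret and PII shapes before the text is shared."""
    for pattern, replacement in _REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Redaction lowers the exposure but does not remove it; internal hostnames, paths, and proprietary code context will still pass through unless a local model is used.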

Missing User Warnings

High
Confidence
99% confidence
Finding
The deployment instructions hardcode a default MySQL root password of root123 and instruct users to set the same value during installation. This creates a predictable administrative credential that is highly likely to be reused in real deployments and can lead to immediate full database compromise.
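
Rather than documenting a fixed password, a deployment could read the credential from the environment and refuse known defaults. In this sketch the variable name `DB_PASSWORD`, the minimum length of 12, and the deny-list beyond `root123` are assumptions.

```python
import os
from typing import Mapping

# root123 comes from the deployment instructions; the rest are common defaults.
KNOWN_DEFAULTS = frozenset({"root123", "root", "password", "123456", ""})
MIN_LENGTH = 12  # illustrative threshold, not a policy recommendation


def load_db_password(env: Mapping[str, str] = os.environ,
                     var: str = "DB_PASSWORD") -> str:
    """Return the configured password, rejecting missing or default values."""
    value = env.get(var)
    if value is None:
        raise RuntimeError(f"{var} is not set")
    if value in KNOWN_DEFAULTS or len(value) < MIN_LENGTH:
        raise RuntimeError(f"{var} is a known default or too short")
    return value
```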

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The README documents AI-based test generation and automated execution, but it does not clearly warn that uploaded documents or prompts may be sent to an external AI provider, nor that generated API/UI scripts can perform real actions against target systems. In an internal testing platform, this can lead users to expose sensitive company data to third-party services or execute unsafe generated tests without adequate review.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The helper accepts authorization credentials from the query string as well as the Authorization header. Query parameters are commonly logged by servers, proxies, browsers, monitoring tools, and referrer headers, increasing the chance of token leakage and replay by unauthorized parties.
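
A header-only helper avoids the leakage path entirely. The sketch below assumes a `Bearer` scheme and a plain mapping of headers; the report does not state which scheme the platform uses, and `extract_token` is a hypothetical name.

```python
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str]) -> Optional[str]:
    """Read the credential from the Authorization header only.

    Query-string parameters are deliberately not consulted, so tokens
    never appear in URLs, access logs, or Referer headers.
    """
    # Header names are case-insensitive, so match on a lowercased key.
    value = ""
    for name, candidate in headers.items():
        if name.lower() == "authorization":
            value = candidate
            break
    scheme, _, token = value.strip().partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```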

VirusTotal

All 67 of 67 vendors reported this skill as clean.
