Byted EMR Skills

Security checks across malware telemetry and agentic risk

Overview

This skill is a real EMR administration tool, but it gives an agent broad cloud-control and credential-handling power with weak safety gates.

Install only if you intentionally want an agent to administer real Volcengine EMR resources. Use least-privilege and preferably temporary credentials, verify the SDK wheel source, avoid putting secrets in command lines or job configs, and require human confirmation before restarts, deletes, privilege grants, password changes, config updates, resource changes, job submissions, or report/history retrieval.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (15)

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The function is a generic wrapper over the EMR OpenAPI and forwards a caller-controlled `action` directly to the backend without any local allowlist or role restriction. In a skill context, this broadens the callable surface from a narrowly scoped 'agent manager' into an arbitrary EMR operation proxy, which can enable unauthorized state changes, information disclosure, or destructive actions if higher layers fail to constrain inputs.
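
One possible mitigation is a local allowlist in front of the generic wrapper. The sketch below is illustrative only: `call_emr`, `ALLOWED_ACTIONS`, and the action names `ListClusters` and `GetCluster` are hypothetical placeholders, and `backend` stands in for whatever function the skill actually uses to reach the EMR OpenAPI.

```python
from typing import Any, Callable, Dict

# Hypothetical read-only action names, used only for illustration.
ALLOWED_ACTIONS = frozenset({"ListClusters", "GetCluster"})


class ActionNotAllowed(Exception):
    """Raised when a caller requests an action outside the allowlist."""


def call_emr(
    action: str,
    params: Dict[str, Any],
    backend: Callable[[str, Dict[str, Any]], Any],
) -> Any:
    """Forward `action` to `backend` only if it is explicitly allowlisted."""
    if action not in ALLOWED_ACTIONS:
        raise ActionNotAllowed(f"action {action!r} is not in the local allowlist")
    return backend(action, params)
```

With a gate like this, widening the callable surface becomes a deliberate change to the allowlist rather than a side effect of caller input.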

Vague Triggers

Medium
Confidence
84% confidence
Finding
The trigger language is overly broad, including instructions to immediately invoke the skill for any similar EMR-related request and for any question involving serverless jobs, queues, compute groups, or diagnosis. Because this skill can perform real administrative operations, broad matching can cause the agent to activate it for loosely related or ambiguous prompts and take privileged actions in the wrong context.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The capability list includes sensitive and destructive actions such as restarting services, modifying configurations, creating users, changing passwords, granting queue privileges, stopping or deleting compute groups, canceling jobs, and updating cluster attributes, but it provides no safety warnings or operator confirmation requirements. In infrastructure-management context, omission of these warnings makes accidental service disruption, privilege changes, and security-impacting misconfiguration more likely.
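
An operator confirmation requirement could look roughly like the following sketch. It assumes that mutating actions can be recognized by a verb prefix, which is a naming convention guessed for illustration and not something the skill documents; `run_guarded`, `is_mutating`, and `MUTATING_PREFIXES` are hypothetical names.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumption: mutating actions start with one of these verbs.
MUTATING_PREFIXES = ("Create", "Update", "Delete", "Reboot", "Stop", "Cancel", "Grant")


def is_mutating(action: str) -> bool:
    """Return True when the action name looks state-changing."""
    return action.startswith(MUTATING_PREFIXES)


def run_guarded(
    action: str,
    execute: Callable[[], T],
    confirm: Callable[[str], bool],
) -> Optional[T]:
    """Run `execute` only after `confirm` approves any mutating action."""
    if is_mutating(action) and not confirm(f"About to run {action}. Proceed?"):
        return None  # operator declined, nothing was executed
    return execute()
```

Passing `confirm` as a callable keeps the prompt mechanism (terminal, chat approval, ticket) outside the gate itself.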

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The setup section tells users to export access keys and secret keys directly into environment variables and install SDK tooling, but it does not warn about secure secret handling, shell history leakage, process/environment exposure, or the need to avoid logging credentials. Since the skill also uses shell, files, and network access, weak credential guidance increases the chance of credential theft or accidental disclosure.
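
As a rough sketch of safer handling, credentials can be read from the environment when present and otherwise prompted for without echo, and only a masked form should ever be logged. The environment variable names are the ones this report mentions elsewhere; `load_credentials` and `mask` are hypothetical helpers, not part of the skill.

```python
import getpass
import os
from typing import Callable, Mapping, Tuple


def load_credentials(
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
) -> Tuple[str, str]:
    """Return (access key, secret key), prompting without echo if unset."""
    ak = env.get("VOLCENGINE_AK") or prompt("Access key: ")
    sk = env.get("VOLCENGINE_SK") or prompt("Secret key: ")
    return ak, sk


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters for safe logging."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Prompting avoids leaving the keys in shell history, although anything exported into the environment remains visible to child processes.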

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The guide instructs operators to list chats, fetch chat history, list reports, and retrieve report details, but it does not warn that these outputs may contain sensitive operational metadata, user prompts, cluster/job identifiers, and historical diagnostic context. In an agent skill context, this omission can lead the assistant to over-collect or expose prior-session data beyond the minimum needed for the user’s request, increasing privacy and data-leakage risk.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The document exposes a state-changing administrative API that can install, stop, restart, uninstall, rebalance, or decommission EMR application components, but it provides no safety guidance, precondition checks, or warnings about outage risk and irreversibility. In an agent skill context, this increases the chance that an automated assistant will invoke disruptive actions directly from a user prompt without adequate confirmation or impact assessment, potentially causing service interruption or cluster instability.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The document includes operational APIs that mutate live EMR resources, such as disk scaling and ECS spec changes, but does not warn users about service disruption, restart risk, irreversibility constraints, or billing impact. In an agent skill context, omission of these safeguards can cause the agent or user to trigger expensive or availability-impacting actions without informed confirmation.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The documentation shows passwords being passed directly on the command line and embedded in request examples without any warning about shell history, process-list exposure, logging, or transcript capture. In an EMR administration skill, these examples are likely to be copied into real operational environments, increasing the chance that cluster credentials are exposed to other users, CI logs, terminal history, or monitoring systems.

Missing User Warnings

Medium
Confidence
98% confidence
Finding
The password rotation example includes both old and new passwords in plaintext CLI/body examples, which doubles the exposure risk and can leak valid credentials during a sensitive operation. Because this skill manages EMR cluster users, disclosure could enable unauthorized access, persistence, or privilege abuse across a data platform environment.
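
One way to keep both passwords out of argv and shell history is to read them from standard input or an interactive no-echo prompt. The following is a minimal sketch under that assumption; `read_secret` is a hypothetical helper and nothing here reflects the skill's actual CLI.

```python
import getpass
import sys
from typing import Optional, TextIO


def read_secret(label: str, stream: Optional[TextIO] = None) -> str:
    """Read one secret: prompt without echo on a TTY, else take one line."""
    stream = sys.stdin if stream is None else stream
    if stream.isatty():
        return getpass.getpass(f"{label}: ")
    # Piped input: one secret per line, trailing newline removed.
    return stream.readline().rstrip("\r\n")
```

For a rotation, the old and new passwords can be supplied as two consecutive lines on stdin, so neither appears in the process list.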

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The guide documents operationally disruptive actions such as UpdateConfig, RebootApplications, and RebootComponentInstance without any caution about service interruption, workload impact, or the fact that some configuration changes may require restart and can affect production jobs. In an agent skill context, this omission increases the chance that an automated assistant will execute high-impact changes directly from user prompts, causing avoidable outages or degraded cluster behavior.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The document exposes fields intended to carry sensitive values such as custom image repository passwords and GCS Redis passwords, but provides no warning against placing raw secrets directly in request bodies, examples, logs, or version-controlled templates. In an agent skill context that helps users construct and submit API payloads, this can normalize insecure secret handling and lead to credential disclosure through command history, chat transcripts, audit logs, or stored job definitions.
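
Where payloads must be logged or echoed back to a user, a redaction pass can reduce exposure. This sketch assumes that sensitive fields can be recognized by substrings in their key names; the marker list and the field names in the usage example are illustrative, not taken from the API.

```python
from typing import Any, Tuple


def redact(value: Any, markers: Tuple[str, ...] = ("password", "secret", "token")) -> Any:
    """Return a copy of `value` with secret-looking fields replaced by '***'."""
    if isinstance(value, dict):
        return {
            key: "***"
            if any(m in str(key).lower() for m in markers)
            else redact(item, markers)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, markers) for item in value]
    return value
```

For example, `redact({"Name": "job", "ImagePassword": "p"})` yields a copy in which only `ImagePassword` is masked; the original payload is left unmodified for the actual request.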

Missing User Warnings

Medium
Confidence
86% confidence
Finding
The guide tells users that local files may be automatically uploaded to TOS during job submission, but it does not clearly warn that local code or data will leave the local environment and be stored remotely. In an EMR skill context, this can cause unintended disclosure of sensitive scripts, jars, dependencies, or embedded data, especially if users assume a local-path submission remains local.
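
A pre-submission check could surface which resources would leave the machine. The sketch below assumes that a resource is local when it has no URL scheme, a `file` scheme, or a single-letter scheme (a Windows drive letter); `local_resources` is a hypothetical helper and the `tos://` path in the usage example is a placeholder.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def local_resources(resources: Iterable[str]) -> List[str]:
    """Return the entries that look like local paths and would be uploaded."""
    local = []
    for res in resources:
        scheme = urlparse(res).scheme
        # No scheme, file://, or a one-letter scheme (drive letter) is local.
        if scheme in ("", "file") or len(scheme) == 1:
            local.append(res)
    return local
```

Calling `local_resources(["./job.py", "tos://bucket/deps.zip"])` returns only `./job.py`, which an agent could then list to the user before asking for confirmation.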

Missing User Warnings

High
Confidence
97% confidence
Finding
The document states that VOLCENGINE_AK/VOLCENGINE_SK are automatically injected into Spark configuration, which is dangerous because Spark conf values can be exposed through job metadata, logs, UIs, debugging output, or downstream code. In a serverless job environment, this materially increases the risk of credential leakage and subsequent unauthorized access to cloud resources.
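
A defensive check before submission might scan the job configuration for known secret values and drop those entries. This is a sketch under the assumption that the configuration is available as a plain dictionary; `scrub_conf` is hypothetical, and the `spark.hypothetical.*` keys in the usage example are invented because the report does not name the keys the skill writes.

```python
from typing import Dict, Iterable, List, Tuple


def scrub_conf(
    conf: Dict[str, str], secrets: Iterable[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Drop conf entries whose value contains any known secret.

    Returns the cleaned conf and the list of removed keys.
    """
    secret_values = [s for s in secrets if s]  # ignore empty strings
    clean: Dict[str, str] = {}
    removed: List[str] = []
    for key, value in conf.items():
        if any(s in str(value) for s in secret_values):
            removed.append(key)
        else:
            clean[key] = value
    return clean, removed
```

A caller could pass the current values of `VOLCENGINE_AK` and `VOLCENGINE_SK` as `secrets`, for instance a conf containing `"spark.hypothetical.sk"` set to the secret key, and then refuse to submit, or ask the operator, whenever the removed list is non-empty.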

Missing User Warnings

High
Confidence
97% confidence
Finding
The PySpark instructions repeat the same unsafe pattern of automatically placing AK/SK into job configuration, where secrets may be visible to the application, operators, logs, or monitoring surfaces. Because PySpark code can easily introspect environment and config at runtime, this increases the chance of accidental exposure or deliberate exfiltration by submitted code.

Missing User Warnings

Medium
Confidence
84% confidence
Finding
The script automatically copies VOLCENGINE_AK and VOLCENGINE_SK from the local environment into the submitted Spark job configuration. In an EMR/serverless context, job configurations are often persisted, viewable in consoles/logs, or accessible to downstream job code, so this can unintentionally disclose long-lived cloud credentials beyond the local execution boundary.

VirusTotal

VirusTotal findings are pending for this skill version.
