Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Alibabacloud Emr Spark Manage

Manage the full lifecycle of Alibaba Cloud EMR Serverless Spark workspaces — create workspaces, submit jobs, run Kyuubi interactive queries, scale resource queues, and more.

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 current installs · 0 all-time installs
by alibabacloud-skills-team@sdk-team
MIT-0
Security Scan
VirusTotal
Benign
View report →
OpenClaw
Suspicious
medium confidence
Purpose & Capability
Name/description, file set and CLI examples all consistently describe EMR Serverless Spark workspace, job, Kyuubi and queue management. The requested operations (create/delete workspaces, submit/cancel jobs, manage Kyuubi tokens, scale queues) align with the stated purpose.
Instruction Scope
SKILL.md is a detailed instruction-only guide that instructs the agent to call Alibaba Cloud REST APIs via the aliyun CLI or SDK, operate OSS paths, manage Kyuubi tokens, and require explicit user confirmation for destructive actions. It does not instruct broad unrelated reads/writes or exfiltration to external endpoints beyond Alibaba Cloud services.
Install Mechanism
No install spec and no code files — instruction-only. This minimizes disk/execution risk; the skill assumes existing aliyun CLI or Python SDK on the host rather than installing arbitrary packages or fetching remote code.
Credentials
SKILL.md explicitly depends on Alibaba Cloud credentials via the 'default credential chain' (environment variables, config files, instance roles, etc.), but the skill metadata lists no required environment variables or primary credential. That mismatch means the skill will rely on credentials present in the agent environment without declaring them, which is an opacity/permission-proportionality concern. Also the docs include examples that may create tokens or operate on OSS — these operations require IAM/RAM privileges; the README lists broad 'FullAccess' policies as examples, so users must be aware of the privilege level needed.
Persistence & Privilege
The skill is not always-enabled and does not request persistent agent privileges. It does not attempt to modify other skills or system-wide settings. Autonomous invocation is allowed (platform default) but is not combined with other elevated privileges here.
What to consider before installing
This is an instruction-only skill that documents how to manage Alibaba Cloud EMR Serverless Spark via the aliyun CLI or SDK — its behavior matches that description. Before installing or enabling it, confirm the following:

  • Credentials: the skill expects Alibaba Cloud credentials from the environment or config files (default credential chain). The skill metadata does not declare required env vars, so ensure you understand which profile/keys/instance role the agent will use and that you trust that environment.
  • Principle of least privilege: do not grant or attach a broad FullAccess RAM policy to the agent account unless you intend full admin capabilities. Prefer the Developer or ReadOnly policies described, and scope permissions to needed resources when possible.
  • Destructive operations: the instructions include workspace and Kyuubi deletion and token management; the docs require explicit confirmation, but verify that your agent will prompt and that you (or an admin) will approve destructive actions.
  • Token handling and OSS paths: creating Kyuubi tokens and uploading code to OSS are sensitive actions — ensure token values are generated and stored securely and OSS object paths are controlled.
  • Unknown source: the skill source is 'unknown' and the contact metadata points to an internal-ish address. If this will run in a production environment, prefer skills from a trusted, identifiable publisher, or vet the files and examples thoroughly.

To proceed safely, run the skill in a non-production account first, with a narrowly scoped IAM/RAM role, and verify that the agent uses the intended credential source. If you can, ask the publisher to declare the specific required env vars and credential expectations in the skill metadata to eliminate the mismatch.

Like a lobster shell, security has layers — review code before you run it.

Current version v0.0.1
Download zip
latest · vk97cptm5wsyctzr7xqxt6nn9ex83zstx

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Alibaba Cloud EMR Serverless Spark Workspace Full Lifecycle Management

Manage EMR Serverless Spark workspaces through Alibaba Cloud API. You are a Spark-savvy data engineer who not only knows how to call APIs, but also knows when to call them and what parameters to use.

Domain Knowledge

Product Architecture

EMR Serverless Spark is a fully-managed Serverless Spark service provided by Alibaba Cloud, supporting batch processing, interactive queries, and stream computing:

  • Serverless Architecture: No need to manage underlying clusters, compute resources allocated on-demand, billed by CU
  • Multi-engine Support: Supports Spark batch processing, Kyuubi (compatible with Hive/Spark JDBC), session clusters
  • Elastic Scaling: Resource queues scale on-demand, no need to reserve fixed resources

Core Concepts

| Concept | Description |
| --- | --- |
| Workspace | Top-level resource container, containing resource queues, jobs, Kyuubi services, etc. |
| Resource Queue | Compute resource pool within a workspace, allocated in CU units |
| CU (Compute Unit) | Compute resource unit, 1 CU = 1 core CPU + 4 GiB memory |
| JobRun | Submission and execution of a Spark job |
| Kyuubi Service | Interactive SQL gateway compatible with open-source Kyuubi, supports JDBC connections |
| SessionCluster | Long-running interactive session environment |
| ReleaseVersion | Available Spark engine versions |
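As a quick sanity check on the CU definition above (1 CU = 1 core CPU + 4 GiB memory), the conversion from a queue size to raw resources is simple arithmetic. The helper below is illustrative only, not part of any SDK:

```python
def cu_to_resources(cu: int) -> dict:
    """Convert a CU count to raw resources, per 1 CU = 1 core CPU + 4 GiB memory."""
    if cu <= 0:
        raise ValueError("CU count must be positive")
    return {"cores": cu, "memory_gib": cu * 4}

# A 50 CU development queue provides 50 cores and 200 GiB of memory.
print(cu_to_resources(50))  # {'cores': 50, 'memory_gib': 200}
```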

Job Types

| Type | Description | Applicable Scenarios |
| --- | --- | --- |
| Spark JAR | Java/Scala packaged JAR jobs | ETL, data processing pipelines |
| PySpark | Python Spark jobs | Data science, machine learning |
| Spark SQL | Pure SQL jobs | Data analysis, report queries |

Recommended Configurations

  • Development & Testing: Pay-as-you-go + 50 CU resource queue
  • Small-scale Production: 200 CU resource queue
  • Large-scale Production: 2000+ CU resource queue, elastic scaling on-demand

Prerequisites

1. Credential Configuration

The Alibaba Cloud CLI/SDK automatically obtains authentication information from the default credential chain, so there is no need to configure credentials explicitly. Multiple credential sources are supported, including configuration files, environment variables, and instance roles.

The recommended way to configure credentials is via the Alibaba Cloud CLI:

aliyun configure

For more credential configuration methods, refer to Alibaba Cloud CLI Credential Management.

2. Grant Service Roles (Required for First-time Use)

Before using EMR Serverless Spark, you need to grant the account the following two roles (see RAM Permission Policies for details):

| Role Name | Type | Description |
| --- | --- | --- |
| AliyunServiceRoleForEMRServerlessSpark | Service-linked role | EMR Serverless Spark service uses this role to access your resources in other cloud products |
| AliyunEMRSparkJobRunDefaultRole | Job execution role | Spark jobs use this role to access OSS, DLF and other cloud resources during execution |

For first-time use, you can authorize through the EMR Serverless Spark Console with one click, or manually create in the RAM console.

3. RAM Permissions

RAM users need corresponding permissions to operate EMR Serverless Spark. For detailed permission policies, specific Action lists, and authorization commands, refer to RAM Permission Policies.

4. OSS Storage

Spark jobs typically need OSS storage for JAR packages, Python scripts, and output data:

# Check for available OSS Buckets
aliyun oss ls --user-agent AlibabaCloud-Agent-Skills

CLI/SDK Invocation

Invocation Method

All APIs are version 2023-08-08, request method is ROA style (RESTful).

# Using Alibaba Cloud CLI (ROA style)
# Important:
#   1. Always add --force --user-agent AlibabaCloud-Agent-Skills; otherwise local metadata
#      validation reports a "can not find api by path" error
#   2. Always pass --region to specify the region (GET may omit it if the CLI has a default
#      Region configured, but explicit is recommended; without a default, the server reports
#      a MissingParameter.regionId error)
#   3. POST/PUT/DELETE write operations must also append ?regionId=cn-hangzhou to the URL;
#      --region alone is not enough. GET requests only need --region

# POST request (note URL append ?regionId=cn-hangzhou)
aliyun emr-serverless-spark POST "/api/v1/workspaces?regionId=cn-hangzhou" \
  --region cn-hangzhou \
  --header "Content-Type=application/json" \
  --body '{"workspaceName":"my-workspace","ossBucket":"oss://my-bucket","ramRoleName":"AliyunEMRSparkJobRunDefaultRole","paymentType":"PayAsYouGo","resourceSpec":{"cu":8}}' \
  --force --user-agent AlibabaCloud-Agent-Skills

# GET request (only need --region)
aliyun emr-serverless-spark GET /api/v1/workspaces --region cn-hangzhou --force --user-agent AlibabaCloud-Agent-Skills

# DELETE request (note URL append ?regionId=cn-hangzhou)
aliyun emr-serverless-spark DELETE "/api/v1/workspaces/{workspaceId}/jobRuns/{jobRunId}?regionId=cn-hangzhou" \
  --region cn-hangzhou --force --user-agent AlibabaCloud-Agent-Skills
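When an agent composes the `--body` payload, serializing a dict with `json.dumps` avoids shell-quoting mistakes. A minimal sketch — the field names mirror the CreateWorkspace CLI example above and should be verified against api-reference.md:

```python
import json

def build_create_workspace_body(name: str, oss_bucket: str, cu: int) -> str:
    """Serialize a CreateWorkspace request body matching the CLI example above."""
    payload = {
        "workspaceName": name,
        "ossBucket": oss_bucket,
        "ramRoleName": "AliyunEMRSparkJobRunDefaultRole",
        "paymentType": "PayAsYouGo",
        "resourceSpec": {"cu": cu},
    }
    return json.dumps(payload)

body = build_create_workspace_body("my-workspace", "oss://my-bucket", 8)
print(body)
```

The resulting string can be passed directly as the `--body` argument.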

Idempotency Rules

For the following operations, use idempotency tokens to avoid duplicate submissions:

| API | Description |
| --- | --- |
| CreateWorkspace | Duplicate submission will create multiple workspaces |
| StartJobRun | Duplicate submission will submit multiple jobs |
| CreateSessionCluster | Duplicate submission will create multiple session clusters |
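A common idempotency pattern is to generate a UUID once per logical operation and reuse it on every retry, so the server collapses duplicates. The exact parameter name for these APIs is not stated here (check api-reference.md); the token generation itself looks like this:

```python
import uuid

def new_idempotency_token() -> str:
    """Generate a token once per logical operation; reuse the same value on
    retries so duplicate submissions are collapsed server-side."""
    return str(uuid.uuid4())

token = new_idempotency_token()
# Pass the same token on every retry of the same StartJobRun / CreateWorkspace call.
print(token)
```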

Intent Routing

| Intent | Operation | Reference |
| --- | --- | --- |
| Beginner / First-time use | Full guide | getting-started.md |
| Create workspace / New Spark | Plan → CreateWorkspace | workspace-lifecycle.md |
| Delete workspace / Destroy | DeleteWorkspace | workspace-lifecycle.md |
| Query workspace / List / Details | ListWorkspaces | workspace-lifecycle.md |
| Submit Spark job / Run task | StartJobRun | job-management.md |
| Query job status / Job list | GetJobRun / ListJobRuns | job-management.md |
| View job logs | ListLogContents | job-management.md |
| Cancel job / Stop job | CancelJobRun | job-management.md |
| View CU consumption | GetCuHours | job-management.md |
| Create Kyuubi service | CreateKyuubiService | kyuubi-service.md |
| Start / Stop Kyuubi | Start/StopKyuubiService | kyuubi-service.md |
| Execute SQL via Kyuubi | Connect Kyuubi Endpoint | kyuubi-service.md |
| Manage Kyuubi Token | Create/List/DeleteKyuubiToken | kyuubi-service.md |
| Scale resource queue / Not enough resources | EditWorkspaceQueue | scaling.md |
| View resource queue | ListWorkspaceQueues | scaling.md |
| Create session cluster | CreateSessionCluster | job-management.md |
| Query engine versions | ListReleaseVersions | api-reference.md |
| Check API parameters | Parameter reference | api-reference.md |

Destructive Operation Protection

The following operations are irreversible. Before executing any of them, complete the pre-checks and obtain explicit user confirmation:

| API | Pre-check Steps | Impact |
| --- | --- | --- |
| DeleteWorkspace | 1. ListJobRuns to confirm no running jobs; 2. ListSessionClusters to confirm no running sessions; 3. ListKyuubiServices to confirm no running Kyuubi; 4. User explicit confirmation | Permanently deletes the workspace and all associated resources |
| CancelJobRun | 1. GetJobRun to confirm job status is Running; 2. User explicit confirmation | Aborts the running job; compute results may be lost |
| DeleteSessionCluster | 1. GetSessionCluster to confirm status is stopped; 2. User explicit confirmation | Permanently deletes the session cluster |
| DeleteKyuubiService | 1. GetKyuubiService to confirm status is NOT_STARTED; 2. Confirm no active JDBC connections; 3. User explicit confirmation | Permanently deletes the Kyuubi service |
| DeleteKyuubiToken | 1. GetKyuubiToken to confirm the Token ID; 2. Confirm connections using this Token can be interrupted; 3. User explicit confirmation | Deletes the Token; connections using it will fail authentication |
| StopKyuubiService | 1. Remind the user that all active JDBC connections will be disconnected; 2. User explicit confirmation | All active JDBC connections disconnected |
| StopSessionCluster | 1. Remind the user that the session will terminate; 2. User explicit confirmation | Session state lost |
| CancelKyuubiSparkApplication | 1. Confirm application ID and status; 2. User explicit confirmation | Aborts the running Spark query |

Confirmation template:

About to execute: <API>, target: <Resource ID>, impact: <Description>. Continue?
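An agent can render the template above with a trivial helper; this is a sketch of the formatting step, nothing more:

```python
def confirmation_prompt(api: str, resource_id: str, impact: str) -> str:
    """Render the destructive-operation confirmation template."""
    return f"About to execute: {api}, target: {resource_id}, impact: {impact}. Continue?"

msg = confirmation_prompt(
    "DeleteWorkspace",
    "w-1234abcd",  # hypothetical workspace ID for illustration
    "Permanently delete workspace and all associated resources",
)
print(msg)
```

The agent should surface this string verbatim and proceed only on an affirmative reply.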

Security Guidelines

Job Submission Protection

Before submitting a Spark job, you must:

  1. Confirm workspace ID and resource queue
  2. Confirm code type codeType (required: JAR / PYTHON / SQL)
  3. Confirm Spark parameters and main program resource
  4. Display equivalent spark-submit command
  5. Get user explicit confirmation before submission
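Step 4 above can be sketched as a display-only builder that maps job parameters onto standard spark-submit flags. The mapping itself is an assumption for illustration; it only produces a string for the user to review, it never executes anything:

```python
def equivalent_spark_submit(code_type: str, entrypoint: str,
                            args=None, conf=None) -> str:
    """Build a display-only spark-submit equivalent for user confirmation."""
    parts = ["spark-submit"]
    for key, value in (conf or {}).items():
        parts += ["--conf", f"{key}={value}"]
    if code_type == "JAR":
        parts += ["--class", "<main-class>"]  # main class must be confirmed with the user
    parts.append(entrypoint)
    parts += list(args or [])
    return " ".join(parts)

cmd = equivalent_spark_submit(
    "PYTHON", "oss://my-bucket/jobs/etl.py",  # hypothetical OSS path
    args=["--date", "2024-01-01"],
    conf={"spark.executor.memory": "4g"},
)
print(cmd)
```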

Timeout Control

| Operation Type | Timeout Recommendation |
| --- | --- |
| Read-only queries | 30 seconds |
| Write operations | 60 seconds |
| Polling wait | 30 seconds per attempt, total not exceeding 30 minutes |
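The polling recommendation (30-second interval, 30-minute cap) can be implemented with a bounded loop. The terminal state names below are assumptions for illustration — confirm the actual JobRun states in api-reference.md:

```python
import time

def poll_until_terminal(get_status, interval_s=30, timeout_s=1800, sleep=time.sleep):
    """Poll a status callback every interval_s seconds, up to timeout_s total."""
    terminal = {"Success", "Failed", "Cancelled"}  # assumed terminal states
    waited = 0
    while True:
        status = get_status()
        if status in terminal:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still {status} after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s

# Simulated statuses stand in for repeated GetJobRun calls.
statuses = iter(["Pending", "Running", "Running", "Success"])
result = poll_until_terminal(lambda: next(statuses), sleep=lambda s: None)
print(result)  # Success
```

Injecting `sleep` keeps the loop testable; in production the default `time.sleep` applies the real 30-second interval.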

Error Handling

| Error Code | Cause | Agent Should Execute |
| --- | --- | --- |
| MissingParameter.regionId | CLI has no default Region configured and --region is missing, or a write operation (POST/PUT/DELETE) URL lacks the ?regionId= suffix | For GET, add --region (a CLI with a default Region configured can use it automatically); write operations must append ?regionId=cn-hangzhou to the URL |
| Throttling | API rate limiting | Wait 5-10 seconds before retrying |
| InvalidParameter | Invalid parameter | Read the error Message and correct the parameter |
| Forbidden.RAM | Insufficient RAM permissions | Inform the user of the missing permissions |
| OperationDenied | Operation not allowed | Query the current status and inform the user to wait |
| null (ErrorCode empty) | Accessing non-existent or unauthorized workspace sub-resources (List* APIs) | Use ListWorkspaces to confirm the workspace ID is correct; check RAM permissions |
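The Throttling guidance (wait 5-10 seconds, then retry) maps naturally onto a retry wrapper. The exception type below is a stand-in for whatever error the CLI/SDK actually raises — an assumption for illustration:

```python
import random
import time

def call_with_throttle_retry(call, max_attempts=4, sleep=time.sleep):
    """Retry on throttling errors, waiting 5-10 seconds between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RuntimeError as err:  # stand-in for the real throttling error type
            if "Throttling" not in str(err) or attempt == max_attempts:
                raise
            sleep(random.uniform(5, 10))

# A fake call that is throttled twice, then succeeds.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("Throttling: request rate exceeded")
    return "ok"

print(call_with_throttle_retry(flaky, sleep=lambda s: None))  # ok
```

Non-throttling errors and exhausted attempts are re-raised so the agent can fall back to the other rows in the table.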

Related Documentation

Files

8 total
