PAI-EAS Service Deployment
⚠️ TOP RULES (read first)
1. 🔴 NO DUPLICATE SERVICE NAMES 🔴
If a service with the target name already exists: STOP and inform
the user. Do NOT delete and recreate. Do NOT reuse it either.
2. Mandatory API Calls — Execute ALL of these in order:
| # | API | CLI | Purpose |
|---|
| 1 | ListImages | aliyun aiworkspace list-images | Validate image |
| 2 | describe-machine-spec | aliyun eas describe-machine-spec | Validate GPU type |
| 3 | create-service | aliyun eas create-service | Create service |
| 4 | describe-service | aliyun eas describe-service | Check status (once) |
| 5 | describe-service-endpoints | aliyun eas describe-service-endpoints | Get endpoints |
Execute #1 and #2 ALWAYS, even if user provided the info.
describe-machine-spec ≠ list-resources. describe-service ≠ ListServices.
3. Prohibited — ❌ Reuse existing service
❌ Write bash scripts (run CLI directly)
❌ CPU+vLLM/SGLang ❌ file:// in create-service
❌ Skip mandatory APIs ❌ Change the service name the user specified
❌ Poll describe-service in a loop (call once only)
4. Autonomous Execution — Do NOT ask user for info discoverable
via APIs. Do NOT ask "should I proceed?" Execute directly.
Timeout? Retry with --read-timeout 60. Error? Inform user and CONTINUE.
Missing param? Pick reasonable default.
If any pre-check or resource discovery step fails, log the failure
and continue to the next step. Only STOP for the specific conditions
listed in Self-Verify Checkpoints (duplicate service name, missing
NLB/GW/dedicated resource group).
5. Self-Verify Checkpoints:
Before Step 2: Did Step 1.5 confirm no duplicate service name?
If duplicate → STOP, inform user, do NOT proceed.
Before Step 5: Have I run list-images AND describe-machine-spec?
If NO → STOP and run them NOW.
Before Step 6: Did Step 1.5 confirm no duplicate service name?
If duplicate exists → STOP, inform user, do NOT proceed.
Step 4 resources: If NLB/GW/Resource Group not found → inform user and STOP.
Do NOT block or attempt workarounds.
After Step 7: Did I call describe-service once and report the status?
6. Run CLI commands DIRECTLY — Use execute_shell_command to run
each aliyun CLI command one at a time. Do NOT write bash scripts,
deployment scripts, or shell files. Do NOT use retry_command()
or any wrapper functions. Each command = one direct execution.
Pre-checks
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent \
--user-agent "AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy"
aliyun plugin update
aliyun configure set --auto-plugin-install true
⚠️ MANDATORY: Install and update required plugins BEFORE proceeding:
aliyun plugin update
aliyun plugin install aiworkspace --force
aliyun plugin install eas --force
Verify plugins are installed:
aliyun plugin list 2>/dev/null | grep -E "aiworkspace|eas"
If either plugin is missing after install, retry with:
aliyun plugin install <plugin-name> --force
See CLI Guide.
Never read/echo AK/SK. Timeout? Retry with --read-timeout 60.
⚠️ MANDATORY CHECKPOINT: After pre-checks, you MUST proceed to
Step 1 regardless of any warnings during plugin installation.
Plugin install failures are non-blocking — the plugins may already
be installed. NEVER stop execution at the pre-check stage.
Step 1: Collect Parameters
| Param | Required | Default |
|---|
| RegionId | Yes | - |
| Service Name | Yes | - |
| Workspace ID | No | Auto-select |
| Image URI | Yes | - |
| Instance Type | Yes | - |
| Replicas | No | 1 |
| Port | No | 8000 |
| OSS Path | No | - |
Service name: lowercase/digits/underscores only. No hyphens. 3-63 chars.
IMPORTANT: Use the EXACT service name the user specifies. Do NOT rename.
If the user specifies a prefix (e.g. "skill_qwen_开头"), generate a random suffix of 6 digits (e.g. skill_qwen_482917).
Set profile region — Set the CLI profile region to match the
deployment region. This avoids "Region mismatch" errors when
--cluster-id differs from the profile's default region:
aliyun configure set --region <region>
Workspace ID: Required in metadata.workspace_id. If user does not
specify a workspace, query available workspaces and pick one:
aliyun aiworkspace list-workspaces --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.Workspaces[] | select(.Status == "ENABLED") | {WorkspaceId, WorkspaceName}'
If multiple workspaces exist, list them and let the user choose.
If only one exists, use it directly.
Step 1.5: Check for Duplicate Service Name
aliyun eas list-services --region <region> --cluster-id <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.ServiceList[] | select(.ServiceName == "<name>") | {ServiceName, Status}'
If a service with the same name already exists → STOP and inform
the user: "A service named <name> already exists (Status: <status>).
Please choose a different name." Do NOT delete or reuse it.
If no duplicate → proceed to Step 2.
Step 2: ListImages (🚧 BLOCKING GATE — NEVER SKIP)
Execute even if user provided image URI. Purpose = VALIDATION.
⚠️ If you see "parse error" or "Exit Code 4", the plugin
failed to install. You MUST retry with explicit install:
aliyun plugin install aiworkspace --force
Then retry the list-images command. Do NOT skip this step.
aliyun aiworkspace list-images --region <region> --verbose true \
--labels 'system.official=true,system.supported.eas=true' \
--page-size 50 --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.Images[] | {ImageId, ImageUri, ImageCategory}' > /tmp/eas_images.json
echo "Saved $(jq length /tmp/eas_images.json) images"
If still fails after plugin reinstall, retry with timeout:
aliyun aiworkspace list-images --region <region> --verbose true \
--labels 'system.official=true,system.supported.eas=true' \
--page-size 50 --read-timeout 60 \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.Images[] | {ImageId, ImageUri, ImageCategory}' > /tmp/eas_images.json
echo "Saved $(jq length /tmp/eas_images.json) images"
See Model-Image.
This output is REQUIRED by Step 5. Do NOT skip.
❌ NEVER skip this step saying "user already provided the image".
Step 3: describe-machine-spec (🚧 BLOCKING GATE — NEVER SKIP)
Execute even if user specified instance type. NOT list-resources.
⚠️ If this command fails, retry with --read-timeout 60.
NEVER proceed to Step 5 without /tmp/eas_specs.json existing.
aliyun eas describe-machine-spec --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.InstanceTypes[] | select(.GPU > 0) | {InstanceType, CPU, GPU, Memory}' \
> /tmp/eas_specs.json
echo "Available GPU specs:"; cat /tmp/eas_specs.json
CPU→GPU auto-switch: If vLLM/SGLang + CPU, silently pick GPU. Do NOT ask.
This output is REQUIRED by Step 5. Do NOT skip.
Step 4: Network & Resource Config
| Type | VPC | Config |
|---|
| Shared | No | (default, no networking fields) |
| Dedicated GW | Yes | networking.gateway + cloud.networking |
| NLB | Yes | networking.nlb + cloud.networking |
⚠️ If a required resource does not exist → STOP and inform the user.
Do NOT block or attempt workarounds. This is a valid outcome.
Dedicated Gateway — Call list-gateway. If no gateway exists →
inform user and STOP. Otherwise call describe-gateway to get
VPC/VSwitch, then query security group under that VPC.
If no security group found → inform user and STOP.
aliyun eas list-gateway --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy
If gateway found, get details:
aliyun eas describe-gateway --region <region> --cluster-id <region> \
--gateway-id <gateway_id> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy
Extract VPC and comma-separated VSwitch ID:
aliyun eas describe-gateway --region <region> --cluster-id <region> \
--gateway-id <gateway_id> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '{vpc_id: .LoadBalancerList[0].VpcId, vswitch_id: (.LoadBalancerList[0].VSwitchIds | join(","))}'
NLB — Requires VPC/VSwitch/SecurityGroup. If user does not provide
them, query via APIs. If any required resource not found → inform
user and STOP.
⚠️ NLB requires ≥2 VSwitches across different availability zones.
Use comma-separated format: "vswitch_id": "vsw-zone-a,vsw-zone-b".
⚠️ NLB Plugin Bug (aliyun-cli-eas v0.2.0): If create-service with
NLB config returns 400 with 'vswitch can not be null' or
'vpcId, vswId and securityGroupId are required', this is a known
CLI plugin bug (not a resource issue). Fallback strategy:
- Retry create-service with NLB config once more (max 2 attempts).
- If both fail → Remove
networking.nlb and cloud.networking from
service.json, redeploy with shared gateway.
- Inform user: "NLB config failed due to CLI plugin limitation.
Deployed with shared gateway instead."
EAS Dedicated Resource Group — Call list-resources.
Filter for ResourceType == "Dedicated" and Status == "ResourceReady".
aliyun eas list-resources --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.Resources[] | select(.ResourceType == "Dedicated" and .Status == "ResourceReady") | {ResourceId, ResourceType, Status}'
- If exists → Set
"metadata": {"resource": "<ResourceId>"}.
Do NOT set cloud.computing.
- If NOT exists → Inform the user and STOP.
Do NOT fall back to public resource group.
Step 5: Build Service JSON
⚠️ BEFORE building JSON, you MUST read these reference files:
references/config-patterns.md — Complete JSON templates for all 8 patterns
references/config-schema.md — Field descriptions and validation rules
references/storage-mount.md — OSS/NAS mount configuration details
references/network-config.md — NLB/Gateway network configuration details
⚠️ HARD GATE: Before writing service.json, VERIFY these files
exist and have content. If either is missing → STOP and run
that Step NOW.
test -s /tmp/eas_images.json || echo "MISSING: Run Step 2 NOW"
test -s /tmp/eas_specs.json || echo "MISSING: Run Step 3 NOW"
⚠️ JSON format rules:
- Allowed top-level keys:
metadata, containers, storage, cloud, autoscaler, networking
- ❌ NEVER use as top-level keys:
spec, ServiceName, Image, Cpu, Memory, Gpu, processor_path, resourceGroupId, instance, port, command, access
- ❌ FORBIDDEN fields:
processor_path, resourceGroupId, spec, access
metadata.name = service name, metadata.workspace_id = workspace (REQUIRED)
containers[].image = image URI, containers[].command = start command, containers[].port = port
cloud.computing.instance_type = instance type (MANDATORY for shared gateway)
Quick Reference — JSON Skeletons
Below are minimal skeletons. Read references/config-patterns.md for
complete templates with all fields and examples.
Base (Shared Gateway):
{"metadata":{"name":"<name>","instance":1,"workspace_id":"<ws>"},
"containers":[{"image":"<img>","port":<p>,"command":"<cmd>"}],
"cloud":{"computing":{"instance_type":"<type>"}}}
+ OSS → add "storage":[{"mount_path":"/dir","oss":{"path":"oss://<b>/<p>/","readOnly":true}}]
+ Autoscaling → add "autoscaler":{"min":1,"max":4,"scaleStrategies":[{"metricName":"qps","threshold":20}]}
+ Health Check → add startup_check to containers[] (see config-patterns.md Pattern 4)
NLB — full template (read references/network-config.md for details):
{"metadata":{"name":"<name>","instance":1,"workspace_id":"<ws>"},
"containers":[{"image":"<img>","port":<p>,"command":"<cmd>"}],
"cloud":{"computing":{"instance_type":"<type>"},
"networking":{"vpc_id":"<vpc>","vswitch_id":"<vsw1>,<vsw2>","security_group_id":"<sg>"}},
"networking":{"nlb":[{"id":"default","listener_port":<p>,"netType":"intranet"}]}}
⚠️ vswitch_id must be comma-separated with ≥2 VSwitches across different zones
Dedicated Resource Group — "metadata.resource" instead of cloud.computing:
{"metadata":{"name":"<name>","instance":1,"resource":"<res_id>","workspace_id":"<ws>"},
"containers":[{"image":"<img>","port":<p>,"command":"<cmd>"}]}
Dedicated Gateway — networking.gateway + cloud.networking:
{"metadata":{"name":"<name>","instance":1,"workspace_id":"<ws>"},
"containers":[{"image":"<img>","port":<p>,"command":"<cmd>"}],
"networking":{"gateway":"<gw_id>"},
"cloud":{"computing":{"instance_type":"<type>"},
"networking":{"vpc_id":"<vpc>","vswitch_id":"<vsw1>,<vsw2>","security_group_id":"<sg>"}}}
⚠️ vswitch_id comma-separated if gateway returns multiple VSwitches
Validate Before Writing
jq -r '.[] | select(.ImageUri | contains("vllm")) | .ImageUri' /tmp/eas_images.json
jq -r '.[] | select(.InstanceType == "<type>") | .InstanceType' /tmp/eas_specs.json
Step 6: Create Service (MANDATORY)
🔴 CONFIRM: Did Step 1.5 confirm no duplicate service name?
If a service with this name already exists → STOP. Inform the user
and do NOT proceed with create-service.
Use $(cat service.json) NOT file://service.json.
Run this DIRECTLY via execute_shell_command, do NOT write a bash script.
aliyun eas create-service --region <region> \
--body "$(cat service.json)" \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy
409 Conflict → Service already exists. Inform the user and STOP.
400 BadRequest with 'vswitch can not be null' or
'vpcId, vswId and securityGroupId are required' → NLB CLI plugin
bug (see Step 4 fallback). Remove networking.nlb and
cloud.networking from service.json and retry.
Step 7: Verify Deployment
Call describe-service ONCE to check the current status. Do NOT poll.
Do NOT loop. Do NOT wait for Running.
aliyun eas describe-service --region <region> --cluster-id <region> \
--service-name <name> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '{Status, ServiceName, ServiceId}'
Report whatever status you get (Running, Waiting, Creating, etc.)
and proceed to Step 8 immediately. create-service returning 200 = success.
Step 8: Report Result (MANDATORY)
Get endpoint info via DescribeServiceEndpoint:
aliyun eas describe-service-endpoints --region <region> --cluster-id <region> \
--service-name <name> --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '{AccessToken, Endpoints: [.Endpoints[] | {
Type: .EndpointType, Port: .Port,
InternetEndpoints: .InternetEndpoints,
IntranetEndpoints: .IntranetEndpoints
}]}'
Use the status from Step 7 and the endpoints above to report.
Copy the ENTIRE output into your final response. Format:
Deployment Summary
==================
Service Name: <name>
Status: <from Step 7>
Endpoints:
- <EndpointType>:
InternetEndpoint: <url or null>
IntranetEndpoint: <url or null>
Port: <port or 0>
Service Invocation Examples:
curl <internet-endpoint>/api/predict/<name> \
-H "Authorization: <AccessToken>"
curl <intranet-endpoint>/api/predict/<name> \
-H "Authorization: <AccessToken>"
curl <nlb-domain>:<listener_port>/api/predict/<name> \
-H "Authorization: <AccessToken>"
InternetEndpoint and IntranetEndpoint MUST appear in your
response, even if null. If null: (not available for this network type)
Always include a service invocation example using the AccessToken
and endpoint URL.
Success criteria: create-service returning 200 with ServiceId =
success. Any status (Running, Waiting, Creating) is acceptable.
When done, disable AI-Mode: aliyun configure ai-mode disable
References (read when needed)