huawei-cloud-sac-yolo

API key required
Dev Tools

"Deploy YOLO training platform on Huawei Cloud with GPU ECS via Terraform. Use when building or managing a YOLO GPU training environment. Trigger: deploy YOLO, YOLO training, GPU training, 部署YOLO, YOLO训练, GPU训练, 视觉模型训练"

Install

openclaw skills install huawei-cloud-sac-yolo

Huawei Cloud YOLO Training Platform

Overview

Deploy the "Quickly Build YOLO Visual Model Training Platform" solution end-to-end on Huawei Cloud. The platform provides GPU-accelerated ECS for YOLO model training, with full infrastructure provisioning via Terraform.

Architecture: ECS (GPU, P2s/Pi2) and VPC and Subnet and Security Group (ICMP/SSH/HTTP) and EIP (300 Mbit/s) and EVS (100 GB system + 500 GB data) and CBR (backup vault + policy). Cloud-init installs Docker and launches the YOLO container on GPU.

Tool chain: Playwright CLI (solution info extraction) + Python 3.8+ (helper scripts) + Terraform 1.15.4+ (declarative deployment). No KooCLI — all resource operations through Terraform.

Prerequisites

  • Python 3.8+, Playwright CLI, Terraform 1.15.4+ — see CLI Installation Guide
  • Huawei Cloud AK/SK via environment variables (HW_ACCESS_KEY, HW_SECRET_KEY); if not set, prompt user to manually edit terraform.auto.tfvars.json to fill in AK/SK
  • IAM user with sufficient permissions or rf_admin_trust agency — see IAM Policies

Security

  • 🚫 Never expose AK/SK in conversation or output
  • 🚫 Never ask user to type AK/SK in chat
  • ✅ Prefer IAM users over primary account
  • ✅ Modification ops (apply, destroy) require explicit user confirmation

Core Commands

Placeholder values (see Parameters for per-OS resolution):

PlaceholderLinux / macOSWindows
<python>python3python
<script_dir>./scripts./scripts
<temp_dir>/tmp$env:TEMP
# 1. Extract solution info
<python> <script_dir>/extract_sac_deploy_info.py \
  --url "https://www.huaweicloud.com/solution/implementations/quickly-build-a-yolo-training-platform.html" \
  --out <temp_dir>/sac_selected.json

# 2. Download and normalize template
<python> <script_dir>/download_tf_template_file.py \
  --url "https://documentation-samples.obs.cn-north-4.myhuaweicloud.com/solution-as-code-publicbucket/solution-as-code-moudle/quickly-build-a-yolo-training-platform/quickly-build-a-yolo-training-platform.tf" \
  --out-dir <temp_dir>/yolo-workdir

<python> <script_dir>/normalize_tf_providers.py <temp_dir>/yolo-workdir \
  --region "cn-north-4"

# 3. List variables for review
<python> <script_dir>/list_tf_variables.py <temp_dir>/yolo-workdir

# 4. Deploy
terraform init
terraform plan
# ⛔ STOP — Review the plan output above. Do NOT auto-apply.
# Confirm with the user (AskUserQuestion or equivalent) before proceeding.
# Only after explicit user confirmation:
terraform apply

# 5. Add YOLO UI security group rule
# Prompt user to manually add an ingress rule for TCP port 8001
# via Huawei Cloud console (VPC > Security Groups > Add Rule).
# Use restricted CIDR — do NOT open to all addresses.
# Wait for user confirmation before continuing.

# 6. Verify
terraform state list
terraform output -json

# 7. Cleanup
terraform destroy

Workflow

1. Extract solution info

<python> <script_dir>/extract_sac_deploy_info.py \
  --url "<solution_detail_page_url>" \
  --out <temp_dir>/sac_selected.json

After extraction, display the results to the user:

  • Solution name: title field from output JSON
  • Estimated price: estimated_price_text field
  • Deploy links: list each text and url from deploy_links array
  • If title or estimated_price_text is empty, warn the user and suggest manual verification on the solution page

2. Download and normalize template

<python> <script_dir>/download_tf_template_file.py \
  --url "<tf_template_url>" \
  --out-dir <temp_dir>/yolo-workdir

<python> <script_dir>/normalize_tf_providers.py <temp_dir>/yolo-workdir \
  --region "cn-north-4"

normalize_tf_providers.py writes terraform.auto.tfvars.json (including region and other parameters). If environment variables HW_ACCESS_KEY/HW_SECRET_KEY are not set, AK/SK fields are left empty. Prompt the user to manually edit the file to fill in AK/SK, then continue to the next step.

3. Confirm variables

<python> <script_dir>/list_tf_variables.py <temp_dir>/yolo-workdir

Review with user. Block apply if sensitive variables are empty/weak.

4. Deploy

STOP — Before running terraform apply, review the terraform plan output and confirm with the user (AskUserQuestion or equivalent). Do NOT auto-apply. Only proceed after explicit user confirmation.

5. Add YOLO UI security group rule

The Terraform template does not include an ingress rule for TCP port 8001, which is required for the YOLO training platform web UI. After deployment, prompt the user to manually add an ingress rule for TCP port 8001 via Huawei Cloud console (VPC > Security Groups > Add Rule). Use your own IP or a restricted CIDR — do NOT open to all addresses.

6. Verify

See Verification Method and Acceptance Criteria.

7. Cleanup

Parameters

ParameterRequiredDefaultConstraint
regionYescn-north-4Only supported region
AK/SKYesEnv vars HW_ACCESS_KEY/HW_SECRET_KEY; if absent, prompt user to edit tfvars.json
ecs_passwordYes8-26 chars, mixed case + digit + special
ecs_flavorNop2s.2xlarge.8
system_disk_sizeNo10040-1024 GB
data_disk_sizeNo50040-1024 GB
bandwidth_sizeNo3001-300 Mbit/s
charging_unitNomonthmonth or year
charging_periodNo1

Post-Deploy Output

  • terraform output -json — includes access_instructions with YOLO platform URL
  • YOLO UI: http://<EIP>:8001 (allow ~10 min for cloud-init)
  • Verify: ssh root@<EIP> "docker ps" and ssh root@<EIP> "nvidia-smi"

Output Format

terraform output -json returns JSON with the following key fields:

{
  "access_instructions": { "value": "http://<EIP>:8001" },
  "ecs_eip":             { "value": "<Elastic IP>" },
  "ecs_id":              { "value": "<ECS Instance ID>" },
  "vpc_id":              { "value": "<VPC ID>" }
}

All script outputs are in JSON format: extract_sac_deploy_info.py outputs solution info JSON, list_tf_variables.py outputs variable list JSON.

Verification

Verify deployment results step by step:

  1. Template extraction — Check <temp_dir>/sac_selected.json contains solution_name, price fields
  2. Template download — Confirm .tf files exist under <temp_dir>/yolo-workdir and terraform validate passes
  3. Variable confirmation — Sensitive variables (AK/SK, password) are not empty in list_tf_variables.py output
  4. Deploymentterraform plan shows no errors; user confirmed deployment; after apply, terraform state list shows all expected resources
  5. Service reachability — Wait 10-15 min for cloud-init, then curl -s http://<EIP>:8001 returns 200
  6. GPUssh root@<EIP> "nvidia-smi" shows GPU device, ssh root@<EIP> "docker ps" shows YOLO container running

See Verification Method and Acceptance Criteria for details.

Best Practices

  • Always terraform plan before apply
  • Start with charging_unit=month; switch to year after validation
  • Allow 10-15 min post-deploy for cloud-init
  • Monitor GPU via nvidia-smi; adjust ecs_flavor if underutilized

Reference Documents

DocumentDescription
CLI Installation GuideInstall Python, Playwright CLI, Terraform
IAM PoliciesPermissions, agency setup, failure handling
Verification MethodStep-by-step verification per workflow step
Acceptance CriteriaFull deployment acceptance checklist
Related CommandsTerraform, scripts, remote access reference

Notes

  • Only cn-north-4 region supported
  • terraform.auto.tfvars.json is sensitive — never commit to VCS
  • normalize_tf_providers.py writes region to tfvars; AK/SK left empty if env vars not set, user must fill manually
  • Tool chain: Playwright CLI + Python + Terraform — no KooCLI