Install
openclaw skills install huawei-cloud-cce-daily-cluster-inspector
Use when performing periodic, low-risk health checks on Huawei Cloud CCE clusters: daily inspection, quick health check, heartbeat summary, or continuous operations report. This skill runs quick checks first and escalates to deep diagnosis only when anomalies are found. It is read-only and performs no mutation actions. Trigger: "cluster inspection", "集群巡检", "daily inspection", "日常巡检", "health check", "健康检查", "cluster heartbeat", "集群心跳", "quick check", "快检", "CCE inspection", "CCE 巡检"
This skill performs periodic, low-risk CCE cluster health inspections. It follows a quick-check-first strategy: run lightweight checks first, and only escalate to deep diagnosis when anomalies are detected. This avoids running heavy diagnostic actions on every inspection cycle.
This skill is strictly read-only — it never performs mutation actions. When risks are found, it outputs recommendations and hands off to the appropriate remediation skill.
Architecture: python3 scripts/huawei-cloud.py dispatcher → Huawei Cloud Python SDK + Kubernetes client → cluster status, Events, metrics, AOM alarms
Related Skills:
- huawei-cloud-cce-auto-remediation-runner: execute confirmed remediation actions (scale, drain, rollback, etc.)
- huawei-cloud-cce-root-cause-analyzer: cross-domain root cause analysis
- huawei-cloud-cce-pod-failure-diagnoser: deep Pod failure diagnosis
- huawei-cloud-cce-node-failure-diagnoser: deep Node failure diagnosis
- huawei-cloud-cce-network-failure-diagnoser: network connectivity, DNS, ELB diagnosis
- huawei-cloud-cce-alarm-correlation-engine: alarm correlation and deduplication
- huawei-cloud-cce-ops-report-generator: formal operations report generation

Do NOT use for:
- Executing remediation actions (use huawei-cloud-cce-auto-remediation-runner)
- Capacity trend forecasting (use huawei-cloud-cce-capacity-trend-forecaster)

Prerequisites:
- Python 3 with the huaweicloudsdkcce, huaweicloudsdkcore, and kubernetes packages
- Never use echo to check credential values
- Environment variables: HUAWEI_AK, HUAWEI_SK, HUAWEI_REGION

export HUAWEI_AK=<your-ak>
export HUAWEI_SK=<your-sk>
export HUAWEI_REGION=cn-north-4
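
Since credential values should not be echoed, one way to confirm the environment is ready is to check only whether each variable is present. The following is a minimal sketch, not part of the skill itself; the helper name `missing_credentials` is hypothetical.

```python
import os

# Variable names taken from the setup above.
REQUIRED_VARS = ("HUAWEI_AK", "HUAWEI_SK", "HUAWEI_REGION")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Only names are returned, never values, so nothing sensitive is printed.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Demo with an explicit mapping so the result does not depend on the real environment.
sample_env = {"HUAWEI_AK": "example", "HUAWEI_SK": "", "HUAWEI_REGION": "cn-north-4"}
print("Missing:", missing_credentials(sample_env))
```

Calling `missing_credentials()` with no argument checks the real process environment instead of the sample mapping.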
| API Action | Permission | Purpose |
|---|---|---|
| cce:cluster:get | Get cluster | View CCE cluster details |
| cce:cluster:createCert | Create certificate | Obtain kubeconfig for kubectl access |
| cce:node:list | List nodes | Query CCE cluster nodes |
| aom:instance:list | List AOM instances | Discover AOM Prom instance for metrics |
| aom:metricsData:get | Get metrics data | Query Pod/node CPU/memory metrics |
| aom:alarm:get | Get alarms | Query AOM alarm history |
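
The permissions in the table could be bundled into a single custom IAM policy. The sketch below generates such a document, assuming the Huawei Cloud custom policy layout with `Version` "1.1" and a `Statement` list; that layout is an assumption here and should be verified against the IAM documentation before use. The function name `build_policy` is hypothetical.

```python
import json

# Action strings copied from the permissions table above.
READ_ONLY_ACTIONS = [
    "cce:cluster:get",
    "cce:cluster:createCert",
    "cce:node:list",
    "aom:instance:list",
    "aom:metricsData:get",
    "aom:alarm:get",
]


def build_policy(actions):
    """Build a policy document that allows only the given actions.

    Assumption: the Version "1.1" custom policy layout; check the IAM docs.
    """
    return {
        "Version": "1.1",
        "Statement": [{"Effect": "Allow", "Action": list(actions)}],
    }


print(json.dumps(build_policy(READ_ONLY_ACTIONS), indent=2))
```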
digraph inspection_flow {
rankdir=TB;
start [shape=doublecircle label="Start Inspection"];
quick [shape=box label="Quick Check\nhuawei_cce_quick_check or\nhuawei_cce_auto_inspection"];
healthy [shape=box label="Output Heartbeat\nSummary"];
anomaly [shape=diamond label="Anomalies\nFound?"];
deep [shape=box label="Deep Diagnosis\nor Parallel Inspection"];
classify [shape=box label="Classify by:\nPod/Node/Event/\nAOM/ELB/Resource"];
report [shape=box label="Generate Report\n(P0/P1/P2 risks)"];
handoff [shape=box label="Recommend\nRemediation Handoff"];
start -> quick;
quick -> anomaly;
anomaly -> healthy [label="No"];
anomaly -> deep [label="Yes"];
deep -> classify;
classify -> report;
report -> handoff;
}
Step-by-step:
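1. Run a quick check (huawei_cce_quick_check or huawei_cce_auto_inspection).
2. If no anomalies are found, output the heartbeat summary.
3. If anomalies are found, escalate to deep diagnosis or parallel inspection.
4. Classify findings by Pod, Node, Event, AOM, ELB, or Resource.
5. Generate a report with P0/P1/P2 risks and recommend a remediation handoff to huawei-cloud-cce-auto-remediation-runner.
6. When a persistent report is needed, export it with huawei_export_inspection_report.

See references/workflow.md for the complete workflow reference.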
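
The quick-check-first flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the skill's implementation: `run_inspection`, the `anomalies` key, and the `priority` key on each risk are hypothetical, and mapping P0 to CRITICAL and other risks to WARNING is a guess at how status might be derived. The two callables stand in for the quick-check and deep-diagnosis actions.

```python
REMEDIATION_SKILL = "huawei-cloud-cce-auto-remediation-runner"


def run_inspection(quick_check, deep_diagnosis):
    """Run the cheap check first; escalate only when it reports anomalies."""
    quick = quick_check()
    anomalies = quick.get("anomalies", [])  # hypothetical key
    if not anomalies:
        # Healthy path: a heartbeat summary, with no heavy checks run.
        return {"mode": "heartbeat", "status": "HEALTHY", "risks": []}

    deep = deep_diagnosis(anomalies)
    # "P0" < "P1" < "P2" sorts correctly as plain strings.
    risks = sorted(deep.get("risks", []), key=lambda risk: risk["priority"])
    # Assumption: any P0 risk makes the cluster CRITICAL, otherwise WARNING.
    has_p0 = any(risk["priority"] == "P0" for risk in risks)
    return {
        "mode": "report",
        "status": "CRITICAL" if has_p0 else "WARNING",
        "risks": risks,
        # Read-only: recommend a handoff rather than remediating.
        "recommended_followups": [REMEDIATION_SKILL],
    }


# Demo with stub callables in place of real actions.
demo = run_inspection(
    lambda: {"anomalies": ["pod-restart-loop"]},
    lambda found: {"risks": [{"priority": "P1", "title": name} for name in found]},
)
print(demo["mode"], demo["status"])
```

Passing the checks in as callables keeps the escalation logic testable without touching a real cluster.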
All actions are dispatched through scripts/huawei-cloud.py using skill action=exec.
Quick-check actions:

| Action | Required Parameters | Description |
|---|---|---|
| huawei_cce_quick_check | region, cluster_id | Lightweight cluster health summary |
| huawei_cce_auto_inspection | region, cluster_id | Automated inspection with anomaly detection |

Deep diagnosis and domain inspection actions:

| Action | Required Parameters | Description |
|---|---|---|
| huawei_cce_deep_diagnosis | region, cluster_id | In-depth cluster diagnosis |
| huawei_cce_cluster_inspection_parallel | region, cluster_id | Parallel multi-domain inspection |
| huawei_cce_cluster_inspection_subagent | region, cluster_id | Subagent-based distributed inspection |
| huawei_pod_status_inspection | region, cluster_id | Pod health inspection |
| huawei_node_status_inspection | region, cluster_id | Node health inspection |
| huawei_node_resource_inspection | region, cluster_id | Node resource utilization inspection |
| huawei_event_inspection | region, cluster_id | Kubernetes Event analysis |
| huawei_aom_alarm_inspection | region, cluster_id | AOM alarm inspection |
| huawei_elb_monitoring_inspection | region, cluster_id | ELB health monitoring inspection |

Aggregation and report actions:

| Action | Required Parameters | Description |
|---|---|---|
| huawei_aggregate_inspection_results | region, cluster_id | Aggregate results from parallel/subagent inspections |
| huawei_export_inspection_report | region, cluster_id | Export formal inspection report |

Parameters:

| Parameter | Required | Description | Default |
|---|---|---|---|
| region | Yes | Huawei Cloud region, e.g., cn-north-4 | HUAWEI_REGION |
| cluster_id | Yes | CCE cluster ID | N/A |
| namespace | No | Kubernetes namespace scope | All namespaces |
| ak | No | Override AK | HUAWEI_AK |
| sk | No | Override SK | HUAWEI_SK |
| project_id | No | Project ID | Auto from IAM |
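
To make the parameter rules concrete, the sketch below assembles the arguments for one action, applying the documented default for `region` and rejecting a missing `cluster_id`. The function `build_request` and the shape of the returned dictionary are hypothetical; the actual payload expected by the dispatcher is not specified here.

```python
import os


def build_request(action, cluster_id, region=None, namespace=None, env=None):
    """Assemble action parameters, falling back to HUAWEI_REGION for region.

    The returned dict layout is illustrative only, not the dispatcher's format.
    """
    env = os.environ if env is None else env
    region = region or env.get("HUAWEI_REGION")
    if not region:
        raise ValueError("region is required (pass it or set HUAWEI_REGION)")
    if not cluster_id:
        raise ValueError("cluster_id is required")

    params = {"region": region, "cluster_id": cluster_id}
    if namespace:
        # Optional: omit to inspect all namespaces.
        params["namespace"] = namespace
    # ak/sk are deliberately left out so they are read from the environment.
    return {"action": action, "params": params}


request = build_request(
    "huawei_cce_quick_check",
    "example-cluster-id",  # placeholder, not a real cluster ID
    env={"HUAWEI_REGION": "cn-north-4"},
)
print(request)
```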
See references/output-schema.md for the complete JSON response structure.
Key output fields:
- summary: daily inspection summary text
- status: HEALTHY, WARNING, or CRITICAL
- cluster.region / cluster.cluster_id: cluster identification
- checks: list of check results
- risks: classified risk items (P0/P1/P2)
- recommended_followups: handoff recommendations to other skills
- report_file: optional exported report path

This skill operates under R1 read-only constraints:
- No mutation actions; remediation is handed off to huawei-cloud-cce-auto-remediation-runner
- See references/risk-rules.md for full risk boundary details

Best practices:
- Start with huawei_cce_quick_check or huawei_cce_auto_inspection; avoid heavy checks on every cycle
- When risks are found, recommend a handoff to huawei-cloud-cce-auto-remediation-runner instead of acting directly
- Set namespace to reduce noise when targeting specific workloads
- After huawei_cce_cluster_inspection_parallel, always call huawei_aggregate_inspection_results to consolidate
- Call huawei_export_inspection_report when a persistent report is needed

Common pitfalls:

| Pitfall | Symptom | Quick Fix |
|---|---|---|
| Running deep diagnosis every cycle | Slow inspection, wasted resources | Start with quick check; escalate only on anomaly |
| Attempting remediation directly | Skill scope violation | Hand off to huawei-cloud-cce-auto-remediation-runner |
| Missing cluster_id | Action fails immediately | Provide cluster_id from huawei_get_cce_clusters |
| No AOM Prom instance | Metrics return empty | Verify AOM instance exists; check aom:instance:list permission |
| Not aggregating parallel results | Incomplete or fragmented report | Call huawei_aggregate_inspection_results after parallel inspection |
| Exposing credentials in report | Security violation | Reports auto-sanitize; never manually include AK/SK or kubeconfig |
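
As an illustration of how the output fields described earlier might be consumed, the sketch below condenses a result into a one-line heartbeat. It assumes each entry in `risks` carries a `priority` key with the value P0, P1, or P2; that inner structure is an assumption, so check references/output-schema.md for the real layout. The function name `heartbeat_line` is hypothetical.

```python
from collections import Counter


def heartbeat_line(result):
    """Summarize status and risk counts per priority in a single line."""
    cluster = result.get("cluster", {})
    # Assumption: every risk item has a "priority" of "P0", "P1", or "P2".
    counts = Counter(risk.get("priority") for risk in result.get("risks", []))
    parts = " ".join(f"{level}={counts.get(level, 0)}" for level in ("P0", "P1", "P2"))
    status = result.get("status", "UNKNOWN")
    return f"[{status}] {cluster.get('cluster_id', '?')} ({cluster.get('region', '?')}): {parts}"


# Hand-written sample, not real inspection output.
sample = {
    "status": "WARNING",
    "cluster": {"region": "cn-north-4", "cluster_id": "example-cluster-id"},
    "risks": [{"priority": "P1"}, {"priority": "P2"}, {"priority": "P1"}],
}
print(heartbeat_line(sample))
```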
| Document | Description |
|---|---|
| Workflow | Quick-check-first escalation workflow and classification |
| Risk Rules | R1 read-only boundaries and prohibited actions |
| Output Schema | JSON response format for inspection results |
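- Dispatch all actions through scripts/huawei-cloud.py via skill action=exec; do not run scripts directly in a shell
- Hand off remediation to huawei-cloud-cce-auto-remediation-runner with a confirmation checklist
- Escalate deeper analysis to the specialized diagnosis skills (huawei-cloud-cce-pod-failure-diagnoser, huawei-cloud-cce-node-failure-diagnoser, etc.)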