Install
openclaw skills install @sdk-team/alibabacloud-network-health-inspection

Comprehensive health inspection tool for Alibaba Cloud network products (EIP, CBWP, NAT Gateway, CEN, Transit Router, Physical Connection, VBR, Global Accelerator, CLB, ALB, NLB). Analyzes bandwidth utilization, connection count, QPS, and packet loss using Cloud Monitor data; generates a Markdown inspection report with monitoring charts, risk assessment, and scaling recommendations. All API calls are read-only. Use when users want to inspect network product usage, check bandwidth headroom, assess capacity before a business launch, or identify over-limit risk. Triggered by: inspect network products, network health check, bandwidth inspection, EIP/NAT/CLB/ALB/NLB/CEN/VBR/GA inspection, network utilization analysis, pre-launch network inspection, capacity assessment, bandwidth over-limit risk.
- aliyun CLI installed and credentials configured (aliyun configure). If not installed, use brew install aliyun-cli (macOS) or download from GitHub Releases
- matplotlib installed: pip3 install matplotlib

All API calls in this Skill are read-only queries and do not involve any resource creation, modification, or deletion. The following permissions are required:
| Cloud Product | API Permission | Purpose |
|---|---|---|
| Cloud Monitor (CMS) | cms:DescribeMetricList | Query monitoring metric data for all network products (core permission) |
| VPC | vpc:DescribeEipAddresses | Query EIP list |
| VPC | vpc:DescribeCommonBandwidthPackages | Query Common Bandwidth Package list |
| VPC | vpc:DescribeNatGateways | Query NAT Gateway list |
| VPC | vpc:DescribePhysicalConnections | Query Physical Connection list |
| VPC | vpc:DescribeVirtualBorderRouters | Query VBR list |
| Cloud Enterprise Network (CEN) | cbn:DescribeCens | Query CEN list |
| Cloud Enterprise Network (CEN) | cbn:DescribeCenBandwidthPackages | Query CEN bandwidth packages |
| Cloud Enterprise Network (CEN) | cbn:ListTransitRouters | Query Transit Router list |
| Cloud Enterprise Network (CEN) | cbn:ListTransitRouterVpcAttachments | Query TR VPC attachments |
| Cloud Enterprise Network (CEN) | cbn:ListTransitRouterVbrAttachments | Query TR VBR attachments |
| Cloud Enterprise Network (CEN) | cbn:ListTransitRouterRouteTables | Query TR route tables |
| Global Accelerator (GA) | ga:ListAccelerators | Query Global Accelerator instance list |
| CLB | slb:DescribeLoadBalancers | Query CLB list |
| ALB | alb:ListLoadBalancers | Query ALB list |
| NLB | nlb:ListLoadBalancers | Query NLB list |
For the recommended RAM policy JSON and detailed security notes, see references/ram-policies.md.
If a product lacks sufficient API permissions, the Skill will skip that product and mark it as an error in the report, without affecting the inspection of other products.
This Skill does not configure a PreToolUse Hook for the following reasons:
- All API calls are Describe* / List* read-only queries, not involving any resource creation, modification, or deletion
- allowed-tools only declares Bash Read; the Skill cannot invoke Write, Edit, or other file-writing tools
- Bash is used only for aliyun CLI read-only query commands and python3 inspection scripts, never for write operations

Therefore, this Skill has no operations that need interception or secondary confirmation, and no PreToolUse Hook is required. If future versions add write-operation capabilities, corresponding PreToolUse Hook interception mechanisms must be added accordingly.
The user needs to provide the following information (extracted from the user's natural language):
- Region, such as cn-hangzhou; multiple regions are separated by commas, e.g., cn-hangzhou,cn-shanghai
- Time range in days, passed as --days, e.g., --days 7, --days 1, --days 3
- Aggregation period in seconds, passed as --period:
  - 60: 1-minute granularity, finest data resolution, suitable for short-term troubleshooting
  - 300: 5-minute granularity, balance between resolution and data volume
  - 900: 15-minute granularity, less data, suitable for long-period overview

Extract product keywords from user input and map them to the corresponding inspection products. The same product may have multiple expressions; all must be recognized:
| Product ID | Possible User Expressions |
|---|---|
| EIP | EIP, Elastic IP, Elastic Public IP, Public IP |
| CBWP | CBWP, Common Bandwidth Package, Shared Bandwidth, Bandwidth Package |
| NAT | NAT, NAT Gateway |
| CEN | CEN, Cloud Enterprise Network |
| TR | TR, Transit Router |
| PhysConn | Physical Connection, Express Connect, Dedicated Line |
| VBR | VBR, Virtual Border Router, Border Router |
| GA | GA, Global Accelerator |
| CLB | CLB, Classic Load Balancer, SLB (Note: when users say "SLB" they usually mean CLB) |
| ALB | ALB, Application Load Balancer |
| NLB | NLB, Network Load Balancer |
Special case handling:
User does not specify products: Inspect all 11 network product categories (default behavior)
User specifies products: Only inspect the products mentioned by the user; skip products not mentioned
User mentions unsupported products: Explicitly inform the user that the product is not currently supported for inspection by this Skill. For example:
This Skill does not currently support inspecting "VPN Gateway". The 11 supported network product categories are: EIP, Common Bandwidth Package, NAT Gateway, Cloud Enterprise Network, Transit Router, Physical Connection, VBR, Global Accelerator, CLB, ALB, NLB.
After informing, continue inspecting other supported products mentioned by the user (if any).
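The keyword mapping and the "no product specified" default can be sketched as a small lookup. This is only an illustration, not the Skill's actual matching logic: the names `PRODUCT_ALIASES` and `resolve_products` are hypothetical, and the aliases are copied from the table above.

```python
import re

# Alias table copied from the mapping above; aliases are lowercase user expressions.
PRODUCT_ALIASES = {
    "EIP": ["eip", "elastic ip", "elastic public ip", "public ip"],
    "CBWP": ["cbwp", "common bandwidth package", "shared bandwidth", "bandwidth package"],
    "NAT": ["nat", "nat gateway"],
    "CEN": ["cen", "cloud enterprise network"],
    "TR": ["tr", "transit router"],
    "PhysConn": ["physical connection", "express connect", "dedicated line"],
    "VBR": ["vbr", "virtual border router", "border router"],
    "GA": ["ga", "global accelerator"],
    "CLB": ["clb", "classic load balancer", "slb"],
    "ALB": ["alb", "application load balancer"],
    "NLB": ["nlb", "network load balancer"],
}


def resolve_products(text):
    """Return the product IDs mentioned in text, in table order.

    Falls back to all 11 products when no supported product is mentioned.
    """
    lowered = text.lower()
    found = []
    for product_id, aliases in PRODUCT_ALIASES.items():
        # Word boundaries keep short aliases such as "tr" or "ga"
        # from matching inside longer words like "gateway".
        if any(re.search(r"\b" + re.escape(alias) + r"\b", lowered) for alias in aliases):
            found.append(product_id)
    return found or list(PRODUCT_ALIASES)
```

This sketch does not detect unsupported products: a request that names only "VPN Gateway" would fall through to the default of all 11 products, so the unsupported-product notice still has to be handled separately.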
Before starting the inspection, you must confirm that the user has provided both the time range and data aggregation period parameters. If the user has not specified them, you must ask:
When time range is not specified, ask:
How long of data would you like to inspect? For example: last 1 day, last 3 days, last 7 days, etc.
When data aggregation period is not specified, ask and clarify the meaning:
What aggregation period would you like for the monitoring data? The aggregation period determines the time interval between monitoring data points:
- 60 seconds (1 minute): One data point per minute, finest resolution, suitable for short time range troubleshooting
- 300 seconds (5 minutes): One data point every 5 minutes, good balance between resolution and data volume
- 900 seconds (15 minutes): One data point every 15 minutes, less data volume, suitable for long-period overall trend overview
Note: The smaller the aggregation period, the more data points need to be fetched, and the inspection time will increase significantly. For example, inspecting 7 days of data with 60-second granularity requires approximately 10,080 data points per metric, while 900-second granularity only requires about 672 data points per metric.
After both parameters are confirmed, proceed with the inspection workflow.
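The data-point figures in the note above follow from dividing the time range by the aggregation period. A minimal sketch of that arithmetic (the function name `points_per_metric` is hypothetical, and the count Cloud Monitor actually returns may differ when data is missing):

```python
def points_per_metric(days, period_seconds):
    """Approximate number of data points per metric for a time range."""
    seconds_in_range = days * 24 * 3600
    return seconds_in_range // period_seconds


for period in (60, 300, 900):
    print(period, points_per_metric(7, period))
```

For 7 days this gives 10,080 points at 60 seconds, 2,016 at 300 seconds, and 672 at 900 seconds.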
| # | Product | Script | Cloud Monitor Namespace | Key Metrics |
|---|---|---|---|---|
| 1 | Elastic IP (EIP) | inspect_eip.py | acs_vpc_eip | Outbound/inbound bandwidth, rate-limit packet loss |
| 2 | Common Bandwidth Package (CBWP) | inspect_cbwp.py | acs_bandwidth_package | Outbound/inbound bandwidth, utilization, rate-limit packet loss |
| 3 | NAT Gateway | inspect_nat.py | acs_nat_gateway | Outbound/inbound bandwidth, SNAT connection count, connection packet loss |
| 4 | Cloud Enterprise Network (CEN) | inspect_cen.py | acs_cen | Cross-region outbound/inbound bandwidth, bandwidth package capacity |
| 5 | Transit Router (TR) | inspect_tr.py | - | TR connection status, VPC/VBR attachment count |
| 6 | Physical Connection | inspect_physconn.py | - | Connection status (for traffic, see VBR) |
| 7 | VBR | inspect_vbr.py | acs_physical_connection | Outbound/inbound bandwidth, health check latency/packet loss |
| 8 | Global Accelerator (GA) | inspect_ga.py | acs_global_acceleration | Outbound/inbound bandwidth, packet rate |
| 9 | Classic Load Balancer (CLB) | inspect_clb.py | acs_slb_dashboard | Traffic, connection count, QPS, packet loss |
| 10 | Application Load Balancer (ALB) | inspect_alb.py | acs_alb | QPS, HTTP status codes, connection count |
| 11 | Network Load Balancer (NLB) | inspect_nlb.py | acs_nlb | Bandwidth, active connections, new connections |
- Use cn-hangzhou as the CMS endpoint region; monitoring data ownership is determined by the instanceId in Dimensions, independent of the endpoint region
- Namespace: acs_physical_connection, not acs_express_connect
- Metric names: bwp_tx_rate/bwp_rx_rate, not net_tx.rate/net_rx.rate
- Metric name: SessionLimitDropRate, not SessionLimitDropConnection
- VbrHealthyCheckLatency is reported in microseconds (us); divide by 1000 to convert to milliseconds (ms)
- VbrHealthyCheckLossRate is already returned as a percentage; do not multiply by 100
- Dimension key: loadBalancerId, not instanceId

All API calls in this Skill are read-only queries and do not involve any resource creation, modification, or deletion.
List of read-only APIs used:
- cms:DescribeMetricList - Query Cloud Monitor data
- vpc:DescribeEipAddresses - Query EIP list
- vpc:DescribeCommonBandwidthPackages - Query Common Bandwidth Package list
- vpc:DescribeNatGateways - Query NAT Gateway list
- vpc:DescribePhysicalConnections - Query Physical Connection list
- vpc:DescribeVirtualBorderRouters - Query VBR list
- cbn:DescribeCens - Query CEN list
- cbn:DescribeCenBandwidthPackages - Query CEN bandwidth packages
- cbn:ListTransitRouters - Query Transit Router list
- cbn:ListTransitRouterVpcAttachments - Query TR VPC attachments
- cbn:ListTransitRouterVbrAttachments - Query TR VBR attachments
- cbn:ListTransitRouterRouteTables - Query TR route tables
- ga:ListAccelerators - Query Global Accelerator instance list
- slb:DescribeLoadBalancers - Query CLB list
- alb:ListLoadBalancers - Query ALB list
- nlb:ListLoadBalancers - Query NLB list

All aliyun CLI calls in this Skill are made through Python inspection scripts. Before executing any script, configure AI-Mode globally so every CLI invocation carries the correct User-Agent header.
# Enable AI-Mode (run once before any aliyun CLI usage)
aliyun configure ai-mode enable
# Set User-Agent to identify this Skill
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-network-health-inspection"
# Update plugins to ensure latest versions
aliyun plugin update 2>/dev/null || true
# Disable AI-Mode after all CLI calls are complete
aliyun configure ai-mode disable
# Check aliyun CLI version
aliyun version 2>&1 | head -1
# Enable AI-Mode and set User-Agent (must run at the start of each Skill execution)
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-network-health-inspection"
# Update aliyun CLI plugins (ensure latest plugin versions)
aliyun plugin update 2>/dev/null || true
# Check Python and matplotlib
python3 -c "import matplotlib; print('matplotlib', matplotlib.__version__)" 2>&1
# If matplotlib is not installed, install it automatically
pip3 install matplotlib
AI-Mode explanation: Enabling AI-Mode causes aliyun CLI to include a User-Agent header in requests, allowing the Alibaba Cloud platform to identify read-only queries initiated by AI Agents. After the Skill execution completes, you must run aliyun configure ai-mode disable to disable AI-Mode and restore the default state.
INSPECT_DIR=$(mktemp -d /tmp/network_inspect_XXXXXX)
CHARTS_DIR="$INSPECT_DIR/charts"
mkdir -p "$CHARTS_DIR"
echo "Inspection data directory: $INSPECT_DIR"
Only run the inspection scripts corresponding to the products specified by the user. If the user has not specified a product scope, run all 11 scripts. Run each script sequentially and save the results as JSON files:
SCRIPTS_DIR="<skill_path>/scripts"
REGIONS="cn-hangzhou" # Based on user-specified region
DAYS=7 # Based on user-specified time range
PERIOD=300 # Based on user-specified aggregation period (seconds)
# The following scripts are executed based on the user's product selection; unselected products are skipped
# 1. EIP inspection (Product ID: EIP)
python3 "$SCRIPTS_DIR/inspect_eip.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/eip.json"
# 2. CBWP inspection (Product ID: CBWP)
python3 "$SCRIPTS_DIR/inspect_cbwp.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/cbwp.json"
# 3. NAT Gateway inspection (Product ID: NAT)
python3 "$SCRIPTS_DIR/inspect_nat.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/nat.json"
# 4. CEN inspection (Product ID: CEN, global resource, no Region needed)
python3 "$SCRIPTS_DIR/inspect_cen.py" --days $DAYS --period $PERIOD > "$INSPECT_DIR/cen.json"
# 5. Transit Router inspection (Product ID: TR, no Cloud Monitor metrics, no --period needed)
python3 "$SCRIPTS_DIR/inspect_tr.py" --regions "$REGIONS" --days $DAYS > "$INSPECT_DIR/tr.json"
# 6. Physical Connection inspection (Product ID: PhysConn, no Cloud Monitor metrics, no --period needed)
python3 "$SCRIPTS_DIR/inspect_physconn.py" --regions "$REGIONS" --days $DAYS > "$INSPECT_DIR/physconn.json"
# 7. VBR inspection (Product ID: VBR)
python3 "$SCRIPTS_DIR/inspect_vbr.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/vbr.json"
# 8. Global Accelerator inspection (Product ID: GA, global resource)
python3 "$SCRIPTS_DIR/inspect_ga.py" --days $DAYS --period $PERIOD > "$INSPECT_DIR/ga.json"
# 9. CLB inspection (Product ID: CLB)
python3 "$SCRIPTS_DIR/inspect_clb.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/clb.json"
# 10. ALB inspection (Product ID: ALB)
python3 "$SCRIPTS_DIR/inspect_alb.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/alb.json"
# 11. NLB inspection (Product ID: NLB)
python3 "$SCRIPTS_DIR/inspect_nlb.py" --regions "$REGIONS" --days $DAYS --period $PERIOD > "$INSPECT_DIR/nlb.json"
On-demand execution rule: Only run the scripts for products selected by the user. For example, if the user only requests inspection of "EIP and NAT Gateway", only execute scripts 1 and 3; skip the other 9 scripts. Products that are not executed will not generate JSON files, and subsequent steps 4 (chart generation) and 5 (report generation) will automatically ignore non-existent product data.
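The on-demand rule can be sketched as a table-driven command builder. This is an illustration under assumptions, not the Skill's own driver: `SCRIPTS` and `build_commands` are hypothetical names, while the script names, flags, and output file names are taken from the command listing above.

```python
# product_id: (script, takes --regions, takes --period), per the listing above.
SCRIPTS = {
    "EIP": ("inspect_eip.py", True, True),
    "CBWP": ("inspect_cbwp.py", True, True),
    "NAT": ("inspect_nat.py", True, True),
    "CEN": ("inspect_cen.py", False, True),       # global resource, no region
    "TR": ("inspect_tr.py", True, False),         # no Cloud Monitor metrics
    "PhysConn": ("inspect_physconn.py", True, False),
    "VBR": ("inspect_vbr.py", True, True),
    "GA": ("inspect_ga.py", False, True),         # global resource, no region
    "CLB": ("inspect_clb.py", True, True),
    "ALB": ("inspect_alb.py", True, True),
    "NLB": ("inspect_nlb.py", True, True),
}


def build_commands(selected, scripts_dir, regions, days, period):
    """Map each output file name to the argv list for one selected product."""
    commands = {}
    for product_id in selected:
        script, uses_regions, uses_period = SCRIPTS[product_id]
        argv = ["python3", f"{scripts_dir}/{script}"]
        if uses_regions:
            argv += ["--regions", regions]
        argv += ["--days", str(days)]
        if uses_period:
            argv += ["--period", str(period)]
        commands[f"{product_id.lower()}.json"] = argv
    return commands
```

Each argv list could then be run sequentially with its standard output redirected to the matching file under $INSPECT_DIR; products that were not selected never appear in the mapping, so no JSON file is produced for them.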
Note: Generate charts before generating the report; the report will automatically embed chart references.
# Generate charts for instances with monitoring data
python3 "$SCRIPTS_DIR/inspect_charts.py" \
--dir "$INSPECT_DIR" \
--days $DAYS \
--period $PERIOD \
--output-dir "$CHARTS_DIR"
# Generate Markdown inspection report; --charts-dir parameter embeds chart references in the report
python3 "$SCRIPTS_DIR/inspect_report.py" \
--dir "$INSPECT_DIR" \
--days $DAYS \
--charts-dir "$CHARTS_DIR" \
--output "$INSPECT_DIR/report.md"
If output to DingTalk Docs is needed, generate the local version report first for display; the DingTalk version report is generated separately in Step 7 (using the --image-url-map parameter to replace local paths with cloud URLs).
Display the full content of $INSPECT_DIR/report.md to the user.

Important: Do not abbreviate, truncate, or summarize the report content in any way. Every word, every table, and every in-depth analysis section of the report must be displayed in full.

Note: Monitoring charts only need to be displayed in the final report (local report or DingTalk document); do not read individual PNG files and display them to the user in the conversation.
# Disable AI-Mode after inspection completes, restore default state
aliyun configure ai-mode disable
If the user requests outputting the report to DingTalk Docs, use the DingTalk Docs MCP service.
Core principle: The report must be written to DingTalk Docs in its entirety; no abbreviation, omission, or truncation of any content is allowed. In-depth analysis, chart references, all tables, and all text must be included without exception. If the content is long, it must be written in multiple segments; reducing content to shorten length is strictly prohibited.
Prerequisite: Check if DingTalk Docs MCP is installed
Check whether the current environment has DingTalk Docs MCP tools available (tool names containing dingtalk with document operation capabilities such as create_document, update_document, get_doc_attachment_upload_info, etc.).
If the DingTalk Docs MCP is not installed, guide the user through the following installation steps:
- Verify that DINGTALK_MCP_DOCS_URL is correctly configured

Complete workflow for publishing the report using DingTalk Docs MCP:
7.1 Create a DingTalk Document
Use the DingTalk Docs MCP create_document to create a document in the target folder, with the document name "Alibaba Cloud Network Product Inspection Report YYYY-MM-DD". Record the returned document dentryUuid (used for all subsequent operations).
7.2 Batch parallel upload of charts to DingTalk Docs (obtain image URLs)
Upload all PNG chart files under $CHARTS_DIR to the DingTalk document in parallel:
Phase 1: Batch obtain upload credentials
Call get_doc_attachment_upload_info for all PNG files at once, with parameters:
- dentryUuid: The document ID returned in Step 7.1
- fileName: Image file name (e.g., eip_eip-xxx_bandwidth.png)
- fileSize: File size in bytes
- mediaType: image/png

Performance critical: Issue all get_doc_attachment_upload_info MCP calls in a single message (do not call them one at a time, waiting for each to return before issuing the next). DingTalk MCP supports concurrent calls, so all credentials can be returned in parallel.
Phase 2: Parallel curl upload
Put all curl -X PUT commands into a single Bash call for parallel execution:
# All curls execute in parallel, wait at the end for all to complete
curl -s -X PUT "<uploadUrl_1>" -H "Content-Type: image/png" --data-binary @"$CHARTS_DIR/file1.png" &
curl -s -X PUT "<uploadUrl_2>" -H "Content-Type: image/png" --data-binary @"$CHARTS_DIR/file2.png" &
curl -s -X PUT "<uploadUrl_3>" -H "Content-Type: image/png" --data-binary @"$CHARTS_DIR/file3.png" &
# ... all images ...
wait
echo "All uploads done"
Note: If there are more than 20 images, split into 2-3 parallel batches (10-15 per batch), with parallelism within each batch and sequential execution between batches, to avoid too many connections.
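One way to apply the batching note is to split the file list into evenly sized groups once it exceeds 20 entries. This is a sketch under assumptions: `split_batches` is a hypothetical helper, and the thresholds are the ones quoted in the note.

```python
import math


def split_batches(files, threshold=20, max_batch=15):
    """Return one batch when there are at most `threshold` files,
    otherwise evenly sized batches of at most `max_batch` files each."""
    files = list(files)
    if len(files) <= threshold:
        return [files]
    n_batches = math.ceil(len(files) / max_batch)
    size = math.ceil(len(files) / n_batches)
    return [files[i:i + size] for i in range(0, len(files), size)]


charts = [f"chart_{i}.png" for i in range(21)]
print([len(batch) for batch in split_batches(charts)])
```

Uploads within a batch run in parallel (the curl ... & lines followed by wait), and the next batch starts only after the previous one finishes.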
Phase 3: Build the mapping
Collect all resourceUrl values returned by get_doc_attachment_upload_info, build an image_url_map ({filename: resourceUrl} mapping), and write it to $INSPECT_DIR/image_url_map.json.
Example image_url_map.json:
{
"eip_eip-xxx_bandwidth.png": "<resourceUrl>",
"nat_ngw-xxx_snat.png": "<resourceUrl>"
}
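Writing the mapping file can be sketched as follows. The shape of the MCP response is not specified here, so this assumes the file names and resourceUrl values have already been collected into (file name, resourceUrl) pairs; `write_image_url_map` is a hypothetical helper.

```python
import json
from pathlib import Path


def write_image_url_map(charts_dir, uploads, out_path):
    """Write {filename: resourceUrl} as JSON.

    uploads: iterable of (file_name, resource_url) pairs (assumed shape).
    Raises ValueError if any PNG in charts_dir has no resourceUrl.
    """
    mapping = dict(uploads)
    missing = sorted(
        p.name for p in Path(charts_dir).glob("*.png") if p.name not in mapping
    )
    if missing:
        raise ValueError("no resourceUrl for: " + ", ".join(missing))
    Path(out_path).write_text(json.dumps(mapping, indent=2), encoding="utf-8")
    return mapping
```

Checking for missing entries before writing makes it harder for a chart to be left with a local path in the DingTalk version of the report.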
7.3 Generate DingTalk version report (using cloud image URLs)
Regenerate the report using the --image-url-map parameter, replacing image references with DingTalk OSS resourceUrls:
python3 "$SCRIPTS_DIR/inspect_report.py" \
--dir "$INSPECT_DIR" \
--days $DAYS \
--charts-dir "$CHARTS_DIR" \
--image-url-map "$INSPECT_DIR/image_url_map.json" \
--output "$INSPECT_DIR/report_dingtalk.md"
All image paths in the generated report_dingtalk.md have been replaced with DingTalk OSS resourceUrls, ensuring proper rendering in DingTalk Docs.
7.4 Write the complete report to DingTalk Docs in segments
Report content is typically very long (detailed analysis of 11 products + in-depth analysis + charts), so it must be written in segments.
Optimal write strategy (tested and verified; follow this directly without adjustment):
- Read $INSPECT_DIR/report_dingtalk.md
- Split by ## second-level headings into multiple segments: Use lines starting with ## as delimiters to split the report into multiple chunks. The first chunk includes the report title header (from # Alibaba Cloud Network Product Comprehensive Inspection Report to before the first ## ). Special attention: The ## Product Inspection Details section contains the complete content for 11 products, which is much larger than other sections, and must be further split by ### third-level headings into one independent chunk per product.
- Each product ### segment includes "detail table + in-depth analysis + monitoring chart references", which form an inseparable whole. Image references are just Markdown text (approximately 100 characters each), not binary images, and will not cause timeouts. Do not separate image references from the product segment and write them independently.
- If a product segment exceeds 10000 characters, split it at the #### Monitoring Charts boundary into two chunks:
  - The content before #### Monitoring Charts
  - #### Monitoring Charts and all chart references below it
- First chunk: update_document, mode: overwrite
- Subsequent chunks: update_document, mode: append
- Merge adjacent small chunks to reduce the number of update_document calls; target is 8-12 calls (not 20+), as each call has network overhead

Note: A single update_document call should not be too large, otherwise it may trigger a DingTalk HSFTimeOutException (3000ms timeout). A single chunk is recommended to be no more than 10000 characters. Tested: 10000 characters or less will not trigger a timeout.
Write example (typical report, note total calls significantly reduced after merging):
| Write Batch | Content | mode | Estimated Characters |
|---|---|---|---|
| Batch 1 | Report title + Executive summary + Inspection overview + Instances requiring attention + Traffic insights | overwrite | ~5000-8000 |
| Batch 2 | Product 1 (EIP) details + In-depth analysis (split out charts if including them exceeds 10000 chars) | append | ~6000-9000 |
| Batch 3 | Product 1 (EIP) monitoring charts (only if previous step was split) | append | ~5000-8000 |
| Batch 4 | Product 2 (CBWP) + Product 3 (NAT) complete content (including chart references, merged write) | append | ~6000-9000 |
| Batch 5 | Product 4 (CEN) + Product 5 (TR) + Product 6 (Physical Connection) + Product 7 (VBR) (merged write) | append | ~5000-8000 |
| Batch 6 | Product 8 (GA) + Product 9 (CLB) complete content (merged write) | append | ~5000-8000 |
| Batch 7 | Product 10 (ALB) + Product 11 (NLB) complete content (merged write) | append | ~5000-8000 |
| Batch 8 | Scaling recommendations + Capacity planning + Appendix (risk descriptions + specification references + methodology notes + disclaimer) | append | ~4000-6000 |
Key rules (strictly follow):
- Chart references for each product must be written immediately after that product's text content; never concentrate all products' charts at the end of the report
- Report format reference: Under #### Monitoring Charts for each product, charts are listed per instance (a bold label giving the instance ID, name, and metric, followed by the image reference); instances without monitoring data will have a text description
- All content must be written in full; no section, table, in-depth analysis, or image reference may be omitted
- Products without instances ("No XXX instances found") can be merged with adjacent empty products for writing
- Speed optimization core: Reduce update_document calls from 20+ to 8-12 through aggressive merging
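The segmentation strategy (split at ## headings, split the product section at ### headings, split oversized products at #### Monitoring Charts, then merge neighbours up to the size limit) can be sketched as below. This is one possible reading of the rules, not the tested procedure itself; all function names are hypothetical, and a piece that still exceeds the limit after splitting is left as is.

```python
import re

MAX_CHARS = 10000  # recommended upper bound per update_document call


def split_report(markdown):
    """Split at '## ' headings; split the product section further at '### '."""
    pieces = []
    for section in re.split(r"(?m)^(?=## )", markdown):
        if section.startswith("## Product Inspection Details"):
            pieces.extend(re.split(r"(?m)^(?=### )", section))
        else:
            pieces.append(section)
    return [p for p in pieces if p]


def split_oversized(piece):
    """Split an oversized product piece at its '#### Monitoring Charts' line."""
    if len(piece) <= MAX_CHARS:
        return [piece]
    parts = re.split(r"(?m)^(?=#### Monitoring Charts)", piece, maxsplit=1)
    return [p for p in parts if p]


def merge_chunks(pieces):
    """Greedily merge adjacent pieces while the result stays within MAX_CHARS."""
    chunks = []
    for piece in pieces:
        if chunks and len(chunks[-1]) + len(piece) <= MAX_CHARS:
            chunks[-1] += piece
        else:
            chunks.append(piece)
    return chunks


def plan_writes(markdown):
    """Return (mode, content) pairs: overwrite first, append afterwards."""
    pieces = [part for p in split_report(markdown) for part in split_oversized(p)]
    chunks = merge_chunks(pieces)
    return [("overwrite" if i == 0 else "append", c) for i, c in enumerate(chunks)]
```

Because pieces are only split and re-joined in their original order, concatenating the planned chunks reproduces the report exactly, and each product's chart references stay directly after that product's text.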
7.5 Return the Document Link
Return the DingTalk document link to the user, confirming that the report has been written in full (including all text content and charts).
| Level | Mark | Criteria |
|---|---|---|
| Critical | [!!!] | Utilization >=90%, packet loss present, VBR latency >100ms or packet loss >5%, abnormal status |
| Warning | [!] | Utilization >=70%, VBR latency >50ms or packet loss >1%, ALB 5XX error rate >1% |
| OK | [OK] | Utilization <70%, all metrics normal |
| Error | [ERR] | API error occurred during inspection |
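As an illustration of how the thresholds in the table combine, the sketch below classifies a single VBR from three inputs. It is not the scripts' actual logic: `classify_vbr` and its parameters are hypothetical, and it applies the unit notes from earlier in this document (VbrHealthyCheckLatency in microseconds, VbrHealthyCheckLossRate already a percentage).

```python
def classify_vbr(utilization_pct, latency_us, loss_rate_pct):
    """Return the risk mark for one VBR instance.

    utilization_pct: bandwidth utilization in percent
    latency_us:      VbrHealthyCheckLatency, in microseconds
    loss_rate_pct:   VbrHealthyCheckLossRate, already a percentage
    """
    latency_ms = latency_us / 1000  # microseconds to milliseconds
    if utilization_pct >= 90 or latency_ms > 100 or loss_rate_pct > 5:
        return "[!!!]"
    if utilization_pct >= 70 or latency_ms > 50 or loss_rate_pct > 1:
        return "[!]"
    return "[OK]"


print(classify_vbr(40, 60000, 0.0))  # 60 ms latency falls in the Warning band
```

The Error mark is not modeled here because it reflects an API failure during inspection rather than a metric value.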
- Install aliyun CLI: brew install aliyun-cli or download from GitHub
- Run aliyun configure to configure access credentials
- RAM permissions: see references/ram-policies.md
- Install matplotlib: pip3 install matplotlib