Alibabacloud Dns Resolve Diagnose

Security

Alicloud DNS Diagnostic Skill. Diagnose DNS resolution failures, domain unreachable issues, NXDOMAIN errors, DNS record propagation delays, DNS hijacking, and other DNS-layer problems. Automatically performs WHOIS queries, recursive tracing, OpenAPI config verification, and nationwide probing via boce. Covers Alibaba Cloud DNS, GTM, PrivateZone, and third-party DNS. Triggers: "DNS resolution failed", "domain unreachable", "DNS not working", "NXDOMAIN", "domain ping failed", "DNS diagnose"

Install

openclaw skills install alibabacloud-dns-resolve-diagnose

Alibaba Cloud DNS Resolution Diagnostics (Customer Self-Service)

Scenario Description

Diagnose real-time DNS resolution anomalies covering the following products:

ProductTypical Issues
Public Authoritative DNS (Alibaba Cloud DNS)Resolution failure, incorrect results, changes not propagated, smart routing misconfiguration
Global Traffic Manager (GTM)GTM scheduling anomaly, CNAME not effective, address pool failover failure
PrivateZone (Internal DNS)Cannot resolve domain within VPC, zone records not effective
Third-party DNSDomain registered on Alibaba Cloud but DNS hosted elsewhere, ECS cannot resolve external domains

Architecture: dig + whois + Alibaba Cloud OpenAPI (Alidns/Domain/PrivateZone) + boce nationwide probing

Not applicable: Domain resolves but website unreachable, HTTPS certificate errors, slow page load, 5xx errors, domain registration/transfer, CDN/WAF issues.

Installation

Pre-check: Aliyun CLI >= 3.3.3 required

Run aliyun version to verify >= 3.3.3. If not installed or version too low, run curl -fsSL https://aliyuncli.alicdn.com/setup.sh | bash to install/update, or see references/cli-installation-guide.md for installation instructions.

Pre-check: Aliyun CLI plugin update required

[MUST] run aliyun configure set --auto-plugin-install true to enable automatic plugin installation. [MUST] run aliyun plugin update to ensure that any existing plugins are always up-to-date.

Additional tools:

ToolPurposeInstallation
python3 (>= 3.7)Run diagnostic scriptsmacOS: brew install python@3.11; Debian/Ubuntu: sudo apt install python3
digDNS queries and tracingmacOS: built-in; Debian/Ubuntu: sudo apt install dnsutils
whoisDomain registration infomacOS: built-in; Debian/Ubuntu: sudo apt install whois
playwrightNationwide probingpip install playwright && playwright install chromium

Authentication

Pre-check: Alibaba Cloud Credentials Required

Security Rules:

  • NEVER read, echo, or print AK/SK values
  • NEVER ask the user to input AK/SK directly in the conversation or command line
  • NEVER use aliyun configure set with literal credential values
  • ONLY use aliyun configure list to check credential status
aliyun configure list

Check the output for a valid profile (AK, STS, or OAuth identity).

If no valid profile exists, the skill will skip OpenAPI config check and rely on dig/whois/boce only.

RAM Policy

See references/ram-policies.md for the full permission list.

[MUST] Permission Failure Handling: When any command or API call fails due to permission errors at any point during execution, follow this process:

  1. Read references/ram-policies.md to get the full list of permissions required by this SKILL
  2. Use ram-permission-diagnose skill to guide the user through requesting the necessary permissions
  3. Pause and wait until the user confirms that the required permissions have been granted

Parameter Confirmation

IMPORTANT: Parameter Confirmation — Before executing any command or API call, ALL user-customizable parameters MUST be confirmed with the user. Do NOT assume or use default values without explicit user approval.

ParameterRequiredDescriptionDefault
DomainYesFull domain name (e.g., www.example.com)None
Issue descriptionYesBrief description of the problemNone
Record typeOptionalDNS record type to queryA

Core Workflow

Observability (MUST follow for every aliyun command)

Upon loading this skill, generate a random session ID (32-char lowercase hex string) and store it for the entire session. Example:

SESSION_ID=$(cat /dev/urandom | LC_ALL=C tr -dc 'a-f0-9' | head -c 32)

Every aliyun business API command MUST include the --user-agent parameter using this template:

--user-agent AlibabaCloud-Agent-Skills/alibabacloud-dns-resolve-diagnose/{session-id}

Where {session-id} is replaced with the generated session ID value. For example:

aliyun alidns describe-domain-info --DomainName example.com --read-timeout 30 --connect-timeout 10 --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dns-resolve-diagnose/${SESSION_ID}

Note: Do NOT use --user-agent on system/tool commands (configure, plugin, help, version, etc.) — only on business API commands.

Core Principles

Mandatory constraint — MUST use scripts for diagnosis, manual troubleshooting is FORBIDDEN:

  1. NEVER use curl, ping, telnet, nc, nslookup, traceroute, wget to troubleshoot directly
  2. ALL diagnostic steps must be executed through the Python scripts provided by this skill
  3. If a script fails, fix the dependency and retry — do NOT fall back to raw commands
  4. Agent's role: execute scripts -> read analysis results -> format output for user

Python Version Detection

PYTHON=""
for cmd in python3.13 python3.12 python3.11 python3.10 python3.9 python3.8 python3; do
  if command -v $cmd &>/dev/null; then
    ver=$($cmd -c "import sys; print(sys.version_info >= (3,7))" 2>/dev/null)
    if [ "$ver" = "True" ]; then PYTHON=$cmd; break; fi
  fi
done
echo "Using Python: $PYTHON ($($PYTHON --version))"

Use $PYTHON instead of python3 in all subsequent commands.

Quick Mode (Recommended, 5 seconds)

$PYTHON scripts/dns_quick_check.py --domain <domain>

Suitable for: DNS resolution failure, quick confirmation. Not suitable for: nationwide probing, OpenAPI config comparison, complex issues (GTM/PrivateZone).

Full Mode (Detailed diagnosis, 30-60 seconds)

Execute steps in order. Output results to user immediately after each step.

Step 0: Quick Pre-check

mkdir -p /tmp/dns_diag_<domain>
$PYTHON scripts/dns_quick_check.py --domain <domain> > /tmp/dns_diag_<domain>/quick.json
$PYTHON scripts/dns_analyze.py quick --file /tmp/dns_diag_<domain>/quick.json

Product type determination: NS contains *.alidns.com / *.hichina.com -> Alibaba Cloud DNS; otherwise -> third-party DNS.

Step 1: Config Check (requires credentials)

Skip this step if credentials are not configured.

# Check if domain is in current account
$PYTHON scripts/dns_openapi.py check --domain <root_domain>
# Query DNS records
$PYTHON scripts/dns_openapi.py records --domain <zone> --rr <host_record>
# Query domain config details
$PYTHON scripts/dns_openapi.py domain-info --domain <zone>

Short-circuit rule: If Step 1 identifies a definitive root cause at config level (incorrect record value, missing record, suspended record), skip Step 2 and proceed to Step 3.

Step 2: Nationwide Probing

If not short-circuited by Step 1, this step MUST NOT be skipped.

$PYTHON scripts/dns_boce.py both --domain <domain> > /tmp/dns_diag_<domain>/both.json
$PYTHON scripts/dns_analyze.py probe \
    --dns /tmp/dns_diag_<domain>/dns_probe.json \
    --http /tmp/dns_diag_<domain>/http_probe.json

Step 3: Comprehensive Analysis and Report

$PYTHON scripts/dns_analyze.py all \
    --quick /tmp/dns_diag_<domain>/quick.json \
    --dns /tmp/dns_diag_<domain>/dns_probe.json \
    --http /tmp/dns_diag_<domain>/http_probe.json

Root cause priority: P0 Domain expired/locked -> P1 DNS provider/recursive interruption -> P2 ICP block/partial route failure/config anomaly.

Verification

See references/verification-method.md

Degradation Strategy

Unavailable ToolFallback
PlaywrightAttempt install first ($PYTHON -m pip install playwright && $PYTHON -m playwright install chromium); only if install fails, degrade to multi-server dig comparison
OpenAPISkip authoritative record query, compare dig results with probing results only
digUse nslookup as substitute (limited, no trace)
All unavailableWHOIS basic check only

Command Reference

See references/related-commands.md

Best Practices

  1. Quick mode first — resolve simple issues in 5 seconds without running full flow
  2. Immediate output per step — do not wait for all steps to complete
  3. Save data — all raw data saved to /tmp/dns_diag_<domain>/
  4. Trust analysis scripts — core conclusions from analysis scripts; agent only formats and supplements
  5. No manual troubleshooting — never use curl/ping/telnet for direct investigation

References

FileContent
references/ram-policies.mdRAM permission policies
references/api-reference.mdOpenAPI parameter reference
references/examples.mdDiagnostic case studies
references/related-commands.mdCLI command table
references/verification-method.mdVerification methods
references/cli-installation-guide.mdCLI installation guide
references/acceptance-criteria.mdAcceptance criteria

Notes

  • All operations are read-only queries; no configuration will be modified
  • OpenAPI calls require read permissions for the corresponding products
  • PrivateZone only works within VPC; public dig queries returning nothing is normal behavior