dataworks-diagnoser

v1.0.0

Fetch and analyze Alibaba Cloud DataWorks task instance logs to diagnose failures and get actionable recommendations using your instance ID and credentials.

⭐ 0· 41·0 current·0 all-time

by@ljw-git-dw

Security Scan

VirusTotal

Benign

View report →

OpenClaw

Suspicious

medium confidence

ℹ

Purpose & Capability

Functionality (fetch DataWorks logs and analyze errors) matches the code: scripts call Alibaba Cloud DataWorks APIs and perform local log analysis. Asking for ALIBABA_CLOUD_ACCESS_KEY_ID / SECRET is appropriate for this purpose. However the registry metadata claims no required env vars or primary credential while SKILL.md and scripts clearly require credentials and Python SDKs — an inconsistency in declared requirements.

✓

Instruction Scope

Runtime instructions and scripts limit operations to: loading credentials (env, ~/.alibabacloud/credentials or ./credentials.json), calling DataWorks APIs, parsing logs, and printing/saving reports. The SKILL.md does not instruct unrelated file reads or external endpoints beyond Alibaba Cloud. The only notable behavior is the scripts will search for local credential files (including credentials.json in working dir), which is expected but worth noting.

ℹ

Install Mechanism

There is no platform-level install spec, and the code relies on standard pip packages (Alibaba Cloud SDKs) listed in requirements.txt — reasonable for the task. SKILL.md embeds a small install metadata snippet (suggesting brew curl) and README instructs pip installs. No downloads from arbitrary URLs or obfuscated installers were found. The minor inconsistency between 'no install spec' in registry metadata and install hints inside SKILL.md is worth correcting.

Credentials

The scripts legitimately need Alibaba Cloud AccessKey ID/Secret and network access to DataWorks endpoints. However the registry metadata omits these required environment variables and primary credential declaration. The skill also tries multiple locations for credentials (env, ~/.alibabacloud/credentials, ./credentials.json) — convenient but increases the chance of accidentally using unintended/local credentials. Ensure you supply a least-privilege RAM subaccount key and avoid leaving long-lived secrets in the working directory.

✓

Persistence & Privilege

The skill does not request permanent 'always' presence, does not modify other skills or system-wide config, and only writes files if the user instructs (save-log / save-report). It runs helper scripts as subprocesses locally; no autonomous elevation or hidden persistence was observed.

What to consider before installing

This package is mostly coherent: it fetches DataWorks logs and analyzes them and does require your Alibaba Cloud AccessKey (ALIBABA_CLOUD_ACCESS_KEY_ID and _SECRET). Before installing or running: 1) Confirm the skill's source/provenance (author repo or signed release); 2) Use a RAM subaccount with minimal permissions (only GetTaskInstanceLog / QueryTask) rather than root account keys; 3) Don't keep credentials in the project directory (credentials.json) unless you intend to; prefer environment variables or a secured config; 4) Run in an isolated environment (virtualenv/container) and review/verify the pip packages (alibabacloud_* and aliyun sdk) you will install; 5) Consider rotating/revoking the AccessKey after use and inspect network traffic if you need to ensure keys are only used against Alibaba endpoints. The main concrete issue is the registry metadata omission — ask the publisher to update metadata to declare the required env vars and dependencies before trusting automated installation.

Like a lobster shell, security has layers — review code before you run it.

latestvk97d1e7exmw7mwnna770kcdrtn84wa58

41downloads

0stars

1versions

Updated 4d ago

v1.0.0

MIT-0

DataWorks Task Instance Diagnostician

Fetches task instance logs from Alibaba Cloud DataWorks API and provides intelligent diagnostic recommendations.

Quick Start

Diagnose a failed task:

python3 scripts/dataworks_diagnose.py <instance_id>

Example:

python3 scripts/dataworks_diagnose.py 123456789

When to Use

✅ USE this skill when:

DataWorks task instance failed and you need to know why
You have an instance ID and need to fetch error logs
You want automated diagnosis and solutions for task failures
Troubleshooting ODPS SQL, Data Integration, Shell, Python nodes
Need to analyze error patterns across multiple failures
Preparing incident reports for failed tasks

When NOT to Use

❌ DON'T use this skill when:

You need real-time task monitoring (use DataWorks console)
You want to modify task configurations (use console or API directly)
You need historical analytics across many tasks (use DataWorks reports)
The task is still running (wait for completion first)
You don't have Alibaba Cloud credentials (need AccessKey)

Prerequisites

1. Alibaba Cloud Credentials

One of the following is required:

Option A: Environment Variables (Recommended)

export ALIBABA_CLOUD_ACCESS_KEY_ID=your_access_key
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=your_access_secret

Option B: Config File Create ~/.alibabacloud/credentials:

{
  "access_key_id": "your_access_key",
  "access_key_secret": "your_access_secret"
}

Option C: Aliyun CLI Config If you have Aliyun CLI configured, credentials will be loaded automatically.

2. Required Permissions

The AccessKey needs these permissions:

dataworks:GetInstanceLog - Fetch task instance logs
dataworks:QueryTask - Query task information

3. Network Access

Access to Alibaba Cloud API endpoints
If using VPC, ensure proper network configuration

Core Workflows

1. Quick Diagnosis (Recommended)

Fetch log and get diagnosis in one command:

python3 scripts/dataworks_diagnose.py <instance_id>

Example:

python3 scripts/dataworks_diagnose.py 123456789

Output:

🔍 开始诊断 DataWorks 任务实例：123456789
📍 区域：cn-hangzhou
------------------------------------------------------------

📥 步骤 1/2: 获取任务日志...
✅ 日志获取成功

🔬 步骤 2/2: 分析诊断中...
✅ 诊断完成

============================================================
📋 诊断报告
============================================================
🔍 DataWorks 任务实例诊断报告
============================================================
实例 ID: 123456789
发现问题数：2

----------------------------------------------------------------------
🔴 问题 1: 资源配额不足
   类型：resource_quota
   严重程度：HIGH
   
   相关日志:
     > ERROR: quota exceeded for resource group 'default'
     > No available slots in queue
   
   建议解决方案:
     1. 检查当前资源组的使用情况，释放闲置资源
     2. 联系管理员提升资源配额
     3. 优化任务配置，减少资源消耗
     4. 考虑错峰调度，避开资源使用高峰
   
   参考文档：https://help.aliyun.com/.../resource-group.html

2. Fetch Log Only

python3 scripts/fetch_instance_log.py <instance_id> [options]

Options:

# Specify region
python3 scripts/fetch_instance_log.py 123456789 --region cn-shanghai

# Output as JSON
python3 scripts/fetch_instance_log.py 123456789 --json

# Show full log (default: last 50 lines)
python3 scripts/fetch_instance_log.py 123456789 --verbose

# Save to file
python3 scripts/fetch_instance_log.py 123456789 > log.txt

3. Diagnose Existing Log

python3 scripts/diagnose_log.py <log_file>

Examples:

# From file
python3 scripts/diagnose_log.py error.log

# From stdin
cat log.txt | python3 scripts/diagnose_log.py

# With instance ID
python3 scripts/diagnose_log.py error.log --instance-id 123456789

# JSON output
python3 scripts/diagnose_log.py error.log --json

# Summary only
python3 scripts/diagnose_log.py error.log --summary

Scripts

This skill includes three scripts:

`dataworks_diagnose.py` - All-in-One Tool

Fetches log and provides diagnosis automatically.

Usage:

python3 scripts/dataworks_diagnose.py <instance_id> [options]

Options:

--region, -r - Alibaba Cloud region (default: cn-hangzhou)
--json, -j - Output as JSON
--verbose, -v - Show full log
--save-log FILE - Save raw log to file
--save-report FILE - Save diagnostic report to file

`fetch_instance_log.py` - Log Fetcher

Fetches task instance log from DataWorks API.

Usage:

python3 scripts/fetch_instance_log.py <instance_id> [options]

Options:

--region, -r - Region (default: cn-hangzhou)
--access-key - Access Key ID
--access-secret - Access Key Secret
--json, -j - JSON output
--verbose, -v - Full log

`diagnose_log.py` - Log Analyzer

Analyzes log content and provides diagnostic recommendations.

Usage:

python3 scripts/diagnose_log.py <log_file_or_stdin> [options]

Options:

--instance-id - Task instance ID
--json, -j - JSON output
--summary, -s - Summary only

Detected Error Patterns

The diagnostician recognizes these error types:

Error Type	Severity	Examples
🔴 resource_quota	High	"quota exceeded", "资源不足"
🔴 resource_expired	High	"expired", "独享资源组已过期", "bill exception"
🔴 connection_timeout	High	"connection timeout", "network unreachable"
🔴 permission_denied	High	"permission denied", "access denied"
🟡 syntax_error	Medium	"syntax error", "parse error"
🟡 table_not_found	Medium	"table not found", "doesn't exist"
🟡 data_quality	Medium	"quality check failed"
🔴 memory_overflow	High	"out of memory", "heap space"
🔴 disk_full	High	"disk full", "no space left"
🟡 dependency_failed	Medium	"dependency failed", "upstream failed"
🟡 api_rate_limit	Medium	"rate limit exceeded"

See references/error_codes.md for detailed error patterns and solutions.

Common Regions

Region	Code
华东 1 (杭州)	cn-hangzhou
华东 2 (上海)	cn-shanghai
华北 1 (青岛)	cn-qingdao
华北 2 (北京)	cn-beijing
华南 1 (深圳)	cn-shenzhen
香港	cn-hongkong
新加坡	ap-southeast-1

API Reference

API: GetTaskInstanceLog
Version: 2024-05-18
Endpoint: https://dataworks-public.{region}.aliyuncs.com/

Request Parameters:

InstanceId (required) - Task instance ID
RegionId (required) - Region ID

Response:

{
  "Data": {
    "LogContent": "...",
    "InstanceStatus": "FAILED",
    "CycleTime": "2024-01-15 10:30:00"
  },
  "Code": "200"
}

Documentation: https://api.aliyun.com/api/dataworks-public/2024-05-18/GetTaskInstanceLog

Examples

Example 1: Quick Diagnosis

python3 scripts/dataworks_diagnose.py 123456789

Example 2: Save Report

python3 scripts/dataworks_diagnose.py 123456789 --save-report diagnosis.txt

Example 3: Different Region

python3 scripts/dataworks_diagnose.py 123456789 --region cn-shanghai

Example 4: Analyze Saved Log

python3 scripts/diagnose_log.py saved_log.txt --instance-id 123456789

Example 5: Batch Analysis

for id in 123 456 789; do
  python3 scripts/diagnose_log.py --instance-id $id < log_$id.txt
done

Troubleshooting

"Credentials not found"

# Set environment variables
export ALIBABA_CLOUD_ACCESS_KEY_ID=your_key
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=your_secret

"Instance not found"

Verify the instance ID is correct
Check if the instance exists in DataWorks console
Ensure you're using the correct region

"Permission denied"

Verify AccessKey has required permissions
Check RAM role configuration
Contact administrator for access

"Request timeout"

Check network connectivity
Try increasing timeout in script
Verify API endpoint is accessible

Tips

💡 Pro tips:

Save logs for failed tasks - Use --save-log to keep records
Generate reports - Use --save-report for documentation
Batch processing - Script supports multiple instance IDs
JSON output - Use --json for programmatic processing
Region matters - Always use the correct region for your workspace

Security

⚠️ Important:

Never commit AccessKeys to version control
Use RAM roles instead of main account keys
Rotate keys regularly
Use environment variables or secure config files
Restrict key permissions to minimum required

References

references/error_codes.md - Complete error code reference
DataWorks Documentation
API Reference
Error Codes

Comments

Loading comments...

dataworks-diagnoser

DataWorks Task Instance Diagnostician

Quick Start

When to Use

When NOT to Use

Prerequisites

1. Alibaba Cloud Credentials

2. Required Permissions

3. Network Access

Core Workflows

1. Quick Diagnosis (Recommended)

2. Fetch Log Only

3. Diagnose Existing Log

Scripts

dataworks_diagnose.py - All-in-One Tool

fetch_instance_log.py - Log Fetcher

diagnose_log.py - Log Analyzer

Detected Error Patterns

Common Regions

API Reference

Examples

Example 1: Quick Diagnosis

Example 2: Save Report

Example 3: Different Region

Example 4: Analyze Saved Log

Example 5: Batch Analysis

Troubleshooting

"Credentials not found"

"Instance not found"

"Permission denied"

"Request timeout"

Tips

Security

References

Comments

`dataworks_diagnose.py` - All-in-One Tool

`fetch_instance_log.py` - Log Fetcher

`diagnose_log.py` - Log Analyzer