Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

dataworks-diagnoser

v1.0.0

Fetch and analyze Alibaba Cloud DataWorks task instance logs to diagnose failures and get actionable recommendations using your instance ID and credentials.

Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
Functionality (fetch DataWorks logs and analyze errors) matches the code: scripts call Alibaba Cloud DataWorks APIs and perform local log analysis. Asking for ALIBABA_CLOUD_ACCESS_KEY_ID / SECRET is appropriate for this purpose. However the registry metadata claims no required env vars or primary credential while SKILL.md and scripts clearly require credentials and Python SDKs — an inconsistency in declared requirements.
Instruction Scope
Runtime instructions and scripts limit operations to loading credentials (env, ~/.alibabacloud/credentials, or ./credentials.json), calling DataWorks APIs, parsing logs, and printing or saving reports. The SKILL.md does not direct unrelated file reads or contact with endpoints other than Alibaba Cloud. The only notable behavior is that the scripts search for local credential files (including credentials.json in the working directory), which is expected but worth noting.
Install Mechanism
There is no platform-level install spec, and the code relies on standard pip packages (Alibaba Cloud SDKs) listed in requirements.txt — reasonable for the task. SKILL.md embeds a small install metadata snippet (suggesting brew curl) and README instructs pip installs. No downloads from arbitrary URLs or obfuscated installers were found. The minor inconsistency between 'no install spec' in registry metadata and install hints inside SKILL.md is worth correcting.
Credentials
The scripts legitimately need Alibaba Cloud AccessKey ID/Secret and network access to DataWorks endpoints. However the registry metadata omits these required environment variables and primary credential declaration. The skill also tries multiple locations for credentials (env, ~/.alibabacloud/credentials, ./credentials.json) — convenient but increases the chance of accidentally using unintended/local credentials. Ensure you supply a least-privilege RAM subaccount key and avoid leaving long-lived secrets in the working directory.
Persistence & Privilege
The skill does not request permanent 'always' presence, does not modify other skills or system-wide config, and only writes files if the user instructs (save-log / save-report). It runs helper scripts as subprocesses locally; no autonomous elevation or hidden persistence was observed.
What to consider before installing
This package is mostly coherent: it fetches DataWorks logs, analyzes them, and requires your Alibaba Cloud AccessKey (ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET). Before installing or running:

  1. Confirm the skill's source and provenance (author repo or signed release).
  2. Use a RAM subaccount with minimal permissions (only GetTaskInstanceLog / QueryTask) rather than root account keys.
  3. Don't keep credentials in the project directory (credentials.json) unless you intend to; prefer environment variables or a secured config.
  4. Run in an isolated environment (virtualenv or container) and review the pip packages (alibabacloud_* and aliyun sdk) you will install.
  5. Consider rotating or revoking the AccessKey after use, and inspect network traffic if you need to ensure keys are only used against Alibaba endpoints.

The main concrete issue is the registry metadata omission: ask the publisher to update the metadata to declare the required env vars and dependencies before trusting automated installation.


latest: vk97d1e7exmw7mwnna770kcdrtn84wa58
41 downloads
0 stars
1 version
Updated 4d ago
v1.0.0
MIT-0

DataWorks Task Instance Diagnostician

Fetches task instance logs from the Alibaba Cloud DataWorks API and analyzes them to produce diagnostic recommendations.

Quick Start

Diagnose a failed task:

python3 scripts/dataworks_diagnose.py <instance_id>

Example:

python3 scripts/dataworks_diagnose.py 123456789

When to Use

USE this skill when:

  • A DataWorks task instance failed and you need to know why
  • You have an instance ID and need to fetch its error logs
  • You want automated diagnosis and suggested solutions for task failures
  • You are troubleshooting ODPS SQL, Data Integration, Shell, or Python nodes
  • You need to analyze error patterns across multiple failures
  • You are preparing incident reports for failed tasks

When NOT to Use

DON'T use this skill when:

  • You need real-time task monitoring (use DataWorks console)
  • You want to modify task configurations (use console or API directly)
  • You need historical analytics across many tasks (use DataWorks reports)
  • The task is still running (wait for completion first)
  • You don't have Alibaba Cloud credentials (an AccessKey is required)

Prerequisites

1. Alibaba Cloud Credentials

One of the following is required:

Option A: Environment Variables (Recommended)

export ALIBABA_CLOUD_ACCESS_KEY_ID=your_access_key
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=your_access_secret

Option B: Config File

Create ~/.alibabacloud/credentials:

{
  "access_key_id": "your_access_key",
  "access_key_secret": "your_access_secret"
}

Option C: Aliyun CLI Config

If you have the Aliyun CLI configured, credentials are loaded automatically.
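
The lookup described by Options A and B could be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: `load_credentials` is a hypothetical helper, and the order (environment variables first, then the JSON config file) is an assumption based on the options listed above.

```python
import json
import os
from pathlib import Path


# Hypothetical helper: the name and lookup order are assumptions made for
# illustration; they are not taken from the skill's scripts.
def load_credentials(config_path=None, environ=None):
    """Return (access_key_id, access_key_secret) or raise RuntimeError."""
    environ = os.environ if environ is None else environ

    # Option A: environment variables.
    key_id = environ.get("ALIBABA_CLOUD_ACCESS_KEY_ID")
    secret = environ.get("ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    if key_id and secret:
        return key_id, secret

    # Option B: JSON config file, by default ~/.alibabacloud/credentials.
    path = Path(config_path) if config_path else Path.home() / ".alibabacloud" / "credentials"
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        key_id = data.get("access_key_id")
        secret = data.get("access_key_secret")
        if key_id and secret:
            return key_id, secret

    raise RuntimeError("Credentials not found")
```

Passing `environ` and `config_path` explicitly makes it easy to check which source a key came from, which helps avoid picking up unintended local credentials.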

2. Required Permissions

The AccessKey needs these permissions:

  • dataworks:GetTaskInstanceLog - Fetch task instance logs
  • dataworks:QueryTask - Query task information

3. Network Access

  • Access to Alibaba Cloud API endpoints
  • If using VPC, ensure proper network configuration

Core Workflows

1. Quick Diagnosis (Recommended)

Fetch log and get diagnosis in one command:

python3 scripts/dataworks_diagnose.py <instance_id>

Example:

python3 scripts/dataworks_diagnose.py 123456789

Output:

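🔍 Starting diagnosis of DataWorks task instance: 123456789
📍 Region: cn-hangzhou
------------------------------------------------------------

📥 Step 1/2: Fetching task log...
✅ Log fetched successfully

🔬 Step 2/2: Analyzing...
✅ Diagnosis complete

============================================================
📋 Diagnostic Report
============================================================
🔍 DataWorks Task Instance Diagnostic Report
============================================================
Instance ID: 123456789
Issues found: 2

----------------------------------------------------------------------
🔴 Issue 1: Insufficient resource quota
   Type: resource_quota
   Severity: HIGH
   
   Related log lines:
     > ERROR: quota exceeded for resource group 'default'
     > No available slots in queue
   
   Suggested solutions:
     1. Check the current resource group's usage and release idle resources
     2. Contact an administrator to raise the resource quota
     3. Optimize the task configuration to reduce resource consumption
     4. Consider off-peak scheduling to avoid resource usage peaks
   
   Reference: https://help.aliyun.com/.../resource-group.html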

2. Fetch Log Only

python3 scripts/fetch_instance_log.py <instance_id> [options]

Options:

# Specify region
python3 scripts/fetch_instance_log.py 123456789 --region cn-shanghai

# Output as JSON
python3 scripts/fetch_instance_log.py 123456789 --json

# Show full log (default: last 50 lines)
python3 scripts/fetch_instance_log.py 123456789 --verbose

# Save to file
python3 scripts/fetch_instance_log.py 123456789 > log.txt
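
The "last 50 lines by default, full log with --verbose" behavior noted above can be illustrated with a small sketch. `tail_lines` is a hypothetical name, and how the real script truncates its output is an assumption.

```python
# Hypothetical helper illustrating the default truncation; the real script's
# implementation may differ.
def tail_lines(log_text, limit=50):
    """Return the last `limit` lines of log_text, or all lines if limit is None."""
    lines = log_text.splitlines()
    if limit is None:
        return lines  # equivalent to --verbose: keep everything
    if limit <= 0:
        return []
    return lines[-limit:]
```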

3. Diagnose Existing Log

python3 scripts/diagnose_log.py <log_file>

Examples:

# From file
python3 scripts/diagnose_log.py error.log

# From stdin
cat log.txt | python3 scripts/diagnose_log.py

# With instance ID
python3 scripts/diagnose_log.py error.log --instance-id 123456789

# JSON output
python3 scripts/diagnose_log.py error.log --json

# Summary only
python3 scripts/diagnose_log.py error.log --summary
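
Accepting the log either as a file argument or on stdin, as in the examples above, might be handled along these lines. This is a sketch under assumptions: `read_log` is a hypothetical helper, not a function from diagnose_log.py.

```python
import sys


# Hypothetical helper: reads from the given path, or from stdin when no path
# is supplied (as in `cat log.txt | python3 scripts/diagnose_log.py`).
def read_log(path=None, stdin=None):
    if path:
        # errors="replace" keeps the read from failing on stray bytes in a log.
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    stream = sys.stdin if stdin is None else stdin
    return stream.read()
```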

Scripts

This skill includes three scripts:

dataworks_diagnose.py - All-in-One Tool

Fetches the log and diagnoses it in a single step.

Usage:

python3 scripts/dataworks_diagnose.py <instance_id> [options]

Options:

  • --region, -r - Alibaba Cloud region (default: cn-hangzhou)
  • --json, -j - Output as JSON
  • --verbose, -v - Show full log
  • --save-log FILE - Save raw log to file
  • --save-report FILE - Save diagnostic report to file
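
For readers who want to wrap or reimplement the command line, the options listed above map naturally onto `argparse`. The sketch below mirrors only the documented flags and defaults; it is not the script's actual source, and `build_parser` and the `as_json` attribute name are assumptions.

```python
import argparse


# Sketch of a parser that mirrors the documented flags; not the real source.
def build_parser():
    parser = argparse.ArgumentParser(
        prog="dataworks_diagnose.py",
        description="Fetch a DataWorks task instance log and diagnose it.",
    )
    parser.add_argument("instance_id", help="Task instance ID")
    parser.add_argument("--region", "-r", default="cn-hangzhou",
                        help="Alibaba Cloud region")
    parser.add_argument("--json", "-j", action="store_true", dest="as_json",
                        help="Output as JSON")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Show full log")
    parser.add_argument("--save-log", metavar="FILE",
                        help="Save raw log to file")
    parser.add_argument("--save-report", metavar="FILE",
                        help="Save diagnostic report to file")
    return parser
```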

fetch_instance_log.py - Log Fetcher

Fetches a task instance log from the DataWorks API.

Usage:

python3 scripts/fetch_instance_log.py <instance_id> [options]

Options:

  • --region, -r - Region (default: cn-hangzhou)
  • --access-key - Access Key ID
  • --access-secret - Access Key Secret
  • --json, -j - JSON output
  • --verbose, -v - Full log

diagnose_log.py - Log Analyzer

Analyzes log content and provides diagnostic recommendations.

Usage:

python3 scripts/diagnose_log.py <log_file_or_stdin> [options]

Options:

  • --instance-id - Task instance ID
  • --json, -j - JSON output
  • --summary, -s - Summary only

Detected Error Patterns

The diagnostician recognizes these error types:

| Error Type | Severity | Examples |
| --- | --- | --- |
| 🔴 resource_quota | High | "quota exceeded", "资源不足" |
| 🔴 resource_expired | High | "expired", "独享资源组已过期", "bill exception" |
| 🔴 connection_timeout | High | "connection timeout", "network unreachable" |
| 🔴 permission_denied | High | "permission denied", "access denied" |
| 🟡 syntax_error | Medium | "syntax error", "parse error" |
| 🟡 table_not_found | Medium | "table not found", "doesn't exist" |
| 🟡 data_quality | Medium | "quality check failed" |
| 🔴 memory_overflow | High | "out of memory", "heap space" |
| 🔴 disk_full | High | "disk full", "no space left" |
| 🟡 dependency_failed | Medium | "dependency failed", "upstream failed" |
| 🟡 api_rate_limit | Medium | "rate limit exceeded" |

See references/error_codes.md for detailed error patterns and solutions.
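
To make the matching concrete, here is a minimal sketch of how the table's example strings could drive a classifier. It assumes simple case-insensitive substring matching per line; the real patterns in references/error_codes.md and the script's matching strategy may differ, and `ERROR_PATTERNS` and `classify` are hypothetical names.

```python
import re

# Example strings copied from the table above. The matching strategy
# (case-insensitive substring search per line) is an assumption.
ERROR_PATTERNS = [
    ("resource_quota", "HIGH", ["quota exceeded", "资源不足"]),
    ("resource_expired", "HIGH", ["expired", "独享资源组已过期", "bill exception"]),
    ("connection_timeout", "HIGH", ["connection timeout", "network unreachable"]),
    ("permission_denied", "HIGH", ["permission denied", "access denied"]),
    ("syntax_error", "MEDIUM", ["syntax error", "parse error"]),
    ("table_not_found", "MEDIUM", ["table not found", "doesn't exist"]),
    ("data_quality", "MEDIUM", ["quality check failed"]),
    ("memory_overflow", "HIGH", ["out of memory", "heap space"]),
    ("disk_full", "HIGH", ["disk full", "no space left"]),
    ("dependency_failed", "MEDIUM", ["dependency failed", "upstream failed"]),
    ("api_rate_limit", "MEDIUM", ["rate limit exceeded"]),
]


def classify(log_text):
    """Return one finding per error type that matches at least one log line."""
    findings = []
    for error_type, severity, patterns in ERROR_PATTERNS:
        regex = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
        matched = [line.strip() for line in log_text.splitlines() if regex.search(line)]
        if matched:
            findings.append({"type": error_type, "severity": severity, "lines": matched})
    return findings
```

Broad strings such as "expired" will also match unrelated lines, so a production classifier would likely need tighter patterns than this sketch uses.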

Common Regions

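| Region | Code |
| --- | --- |
| China East 1 (Hangzhou) | cn-hangzhou |
| China East 2 (Shanghai) | cn-shanghai |
| China North 1 (Qingdao) | cn-qingdao |
| China North 2 (Beijing) | cn-beijing |
| China South 1 (Shenzhen) | cn-shenzhen |
| Hong Kong | cn-hongkong |
| Singapore | ap-southeast-1 |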

API Reference

API: GetTaskInstanceLog
Version: 2024-05-18
Endpoint: https://dataworks-public.{region}.aliyuncs.com/

Request Parameters:

  • InstanceId (required) - Task instance ID
  • RegionId (required) - Region ID

Response:

{
  "Data": {
    "LogContent": "...",
    "InstanceStatus": "FAILED",
    "CycleTime": "2024-01-15 10:30:00"
  },
  "Code": "200"
}

Documentation: https://api.aliyun.com/api/dataworks-public/2024-05-18/GetTaskInstanceLog
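
The endpoint template and response shape above can be handled with a couple of small helpers. This sketch only builds the URL and parses a response body; it does not sign or send a request (the Alibaba Cloud SDK does that). `build_endpoint` and `extract_log` are hypothetical names, and treating any `Code` other than "200" as an error is an assumption.

```python
import json


def build_endpoint(region="cn-hangzhou"):
    """Fill the region into the endpoint template shown above."""
    return f"https://dataworks-public.{region}.aliyuncs.com/"


def extract_log(response_text):
    """Pull the log and status fields out of a GetTaskInstanceLog-style response."""
    payload = json.loads(response_text)
    # Assumption: any Code other than "200" indicates a failed call.
    if str(payload.get("Code")) != "200":
        raise ValueError(f"API call failed with code {payload.get('Code')}")
    data = payload.get("Data") or {}
    return {
        "log": data.get("LogContent", ""),
        "status": data.get("InstanceStatus"),
        "cycle_time": data.get("CycleTime"),
    }
```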

Examples

Example 1: Quick Diagnosis

python3 scripts/dataworks_diagnose.py 123456789

Example 2: Save Report

python3 scripts/dataworks_diagnose.py 123456789 --save-report diagnosis.txt

Example 3: Different Region

python3 scripts/dataworks_diagnose.py 123456789 --region cn-shanghai

Example 4: Analyze Saved Log

python3 scripts/diagnose_log.py saved_log.txt --instance-id 123456789

Example 5: Batch Analysis

for id in 123 456 789; do
  python3 scripts/diagnose_log.py --instance-id "$id" < "log_$id.txt"
done
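
The same batch loop can be driven from Python when the exit codes need to be collected. This is a sketch under assumptions: `diagnose_command` and `run_batch` are hypothetical helpers, the log_<id>.txt naming follows Example 5, and the meaning of the script's exit codes is not documented here.

```python
import subprocess
import sys


def diagnose_command(instance_id, log_path, script="scripts/diagnose_log.py"):
    """Build the argv for one diagnose_log.py run."""
    return [sys.executable, script, log_path, "--instance-id", str(instance_id)]


def run_batch(instance_ids, runner=subprocess.run):
    """Run the analyzer once per instance ID and map each ID to its exit code."""
    results = {}
    for instance_id in instance_ids:
        cmd = diagnose_command(instance_id, f"log_{instance_id}.txt")
        completed = runner(cmd, capture_output=True, text=True)
        results[instance_id] = completed.returncode
    return results
```

The `runner` parameter exists so the loop can be exercised without invoking the real script.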

Troubleshooting

"Credentials not found"

# Set environment variables
export ALIBABA_CLOUD_ACCESS_KEY_ID=your_key
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=your_secret

"Instance not found"

  • Verify the instance ID is correct
  • Check if the instance exists in DataWorks console
  • Ensure you're using the correct region

"Permission denied"

  • Verify AccessKey has required permissions
  • Check RAM role configuration
  • Contact administrator for access

"Request timeout"

  • Check network connectivity
  • Try increasing the timeout in the script
  • Verify API endpoint is accessible
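
If timeouts are intermittent, wrapping the API call in a retry with exponential backoff is a common mitigation. The sketch below is generic Python and not part of the skill; `call_with_retry` is a hypothetical helper, and which exceptions the SDK raises on timeout is an assumption to verify against the SDK in use.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0,
                    retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```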

Tips

💡 Pro tips:

  1. Save logs for failed tasks - Use --save-log to keep records
  2. Generate reports - Use --save-report for documentation
  3. Batch processing - Script supports multiple instance IDs
  4. JSON output - Use --json for programmatic processing
  5. Region matters - Always use the correct region for your workspace

Security

⚠️ Important:

  • Never commit AccessKeys to version control
  • Use RAM roles instead of main account keys
  • Rotate keys regularly
  • Use environment variables or secure config files
  • Restrict key permissions to minimum required
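
One way to check that a config file is "secure" in the sense above is to confirm that only its owner can access it. This is a minimal sketch for POSIX systems; `is_private` is a hypothetical helper and not something the skill provides.

```python
import os
import stat


def is_private(path):
    """Return True when group and others have no permissions on the file (POSIX)."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

For example, a credentials file with mode 600 passes this check, while one with mode 644 does not.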
