Install
openclaw skills install alibabacloud-dataworks-data-quality
DataWorks Data Quality (Read-Only): Query rule templates, data quality monitors (scans), alert rules, and scan run records/logs. Uses the aliyun CLI to call the dataworks-public OpenAPI (2024-05-18). All operations are read-only: no create, update, or delete.
Trigger keywords: DataWorks data quality, quality rule, quality template, quality monitor, quality scan, scan run, quality check result, quality alert rule, quality run log, DQ monitor, data quality execution, quality pass/fail, list quality scans, get quality scan, query quality result, quality monitoring detail, quality run history.
Not triggered: creating/updating/deleting quality rules or monitors, data source management, compute resource management, resource group management, workspace member management, data development tasks, scheduling configuration.
Query and investigate Rule Templates, Data Quality Monitors, Alert Rules, and Scan Run Records in Alibaba Cloud DataWorks.
Coverage: all 9 Get/List read-only OpenAPIs under DataWorks Data Quality:
- ListDataQualityTemplates / GetDataQualityTemplate
- ListDataQualityScans / GetDataQualityScan
- ListDataQualityAlertRules / GetDataQualityAlertRule
- ListDataQualityScanRuns / GetDataQualityScanRun / GetDataQualityScanRunLog
Excludes write operations: Create / Update / Delete / CreateDataQualityScanRun.
Read-Only Skill: This skill supports query operations only. Any write operation request must be blocked immediately — direct the user to the DataWorks console.
DataWorks Data Quality
├── Rule Templates ─── Reusable metric logic definitions (built-in & custom)
│
├── Data Quality Monitors (Scans) ─── Monitor tasks bound to tables, with rules and trigger config
│   └── Alert Rules ─── Notification rules tied to a monitor (channels, recipients, conditions)
│
└── Scan Runs ─── Execution records each time a monitor runs
    └── Scan Run Logs ─── Detailed execution logs for a run
- aliyun version (If not installed or version too low, run curl -fsSL https://aliyuncli.alicdn.com/setup.sh | bash to install/update. See references/cli-installation-guide.md)
- aliyun configure set --auto-plugin-install true
- aliyun plugin update to ensure that any existing plugins on your local machine are always up-to-date.
- aliyun configure ai-mode enable
- aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality"
- aliyun configure ai-mode disable
- which jq
- aliyun configure list, then verify that valid credentials exist
Security Rules: DO NOT read/print/echo AK/SK values. ONLY use aliyun configure list to check credential status.
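The tool checks above can be bundled into one preflight step. The following is a minimal sketch, not part of the skill itself: the helper name missing_tools is hypothetical, and it only checks that aliyun and jq are on PATH, without inspecting versions or credentials.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper: print the names of required tools that are
# not found on PATH (space-separated; empty output means nothing is missing).
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "${missing# }"
}

missing=$(missing_tools aliyun jq)
if [ -n "$missing" ]; then
  echo "Missing tools: $missing" >&2
else
  # Credential status is checked separately; never print AK/SK values.
  echo "aliyun and jq found; next: aliyun configure list"
fi
```

Keeping the check as a function makes it easy to extend the tool list without touching the reporting logic.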
- All aliyun CLI commands must include --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality.
- All aliyun CLI commands must include --connect-timeout 5 --read-timeout 10. These match the CLI built-in defaults and make the timeout policy explicit.
- Use \ for line breaks.
- Run the aliyun command to get JSON, then pipe to jq for formatting.
- When specifying --region, you must also add --endpoint dataworks.<REGION_ID>.aliyuncs.com.
Must be explicitly provided by the user; do not assume or use defaults:
- ProjectId: core parameter for every query (must be confirmed)
- Id-type resource identifiers: template ID, monitor ID, alert rule ID, scan run ID
- region: affects the endpoint (must be confirmed)
Can use default values directly; no user confirmation needed:
- PageNumber: default 1
- PageSize: default 10
- SortBy: default ModifyTime Desc or CreateTime Desc
Ask contextually; collect only when the user has a specific need:
- Name, Table: fuzzy search keywords
- CreateTimeFrom / CreateTimeTo
- Status: collect only when the user explicitly wants to filter by a specific status
If the user has already provided ProjectId, Id, or region in the conversation, reuse them directly without re-confirmation.
When the user describes time in natural language, convert it to millisecond timestamps automatically. Do not ask the user to provide raw timestamps.
- "yesterday" → yesterday 00:00:00 to 23:59:59
- "today" → today 00:00:00 to current time
- "last N days" → current time minus N × 24 hours through current time
After every query, present the result in a decision-friendly way:
- Summarize first; show the full Spec only if the user asks
- Highlight Fail / Error / Warn, and proactively recommend the next diagnostic step
- For empty results, suggest checking for a wrong ProjectId, wrong region, or filters that are too strict
- Use the default PageSize of 10
- When the results exceed PageSize, proactively offer the next page or a larger PageSize
- Do not request more than 100 records in a single request
MANDATORY: Before responding to ANY request, check whether it involves a write operation. If YES: BLOCK immediately. Do NOT call any API. Respond with: "This skill supports query operations only and cannot perform create/update/delete. Please go to the DataWorks Console for configuration."
Quick Reference — All Blocked Operations:
| Operation Type | Blocked APIs |
|---|---|
| Create | CreateDataQualityTemplate, CreateDataQualityScan, CreateDataQualityScanRun, CreateDataQualityAlertRule |
| Update | UpdateDataQualityTemplate, UpdateDataQualityScan, UpdateDataQualityAlertRule |
| Delete | DeleteDataQualityTemplate, DeleteDataQualityScan, DeleteDataQualityAlertRule |
| Trigger | CreateDataQualityScanRun (manual execution trigger) |
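The write-operation check can be expressed as an allowlist rather than a blocklist, so that any API not explicitly known to be read-only is rejected. This is a sketch under that assumption; the function names is_read_only_action and guard_action are hypothetical and not defined by the skill.

```shell
#!/usr/bin/env bash
# Return 0 only for the nine read-only Data Quality APIs named in this
# document; everything else (Create/Update/Delete/unknown) counts as blocked.
is_read_only_action() {
  case "$1" in
    ListDataQualityTemplates|GetDataQualityTemplate) return 0 ;;
    ListDataQualityScans|GetDataQualityScan) return 0 ;;
    ListDataQualityAlertRules|GetDataQualityAlertRule) return 0 ;;
    ListDataQualityScanRuns|GetDataQualityScanRun) return 0 ;;
    GetDataQualityScanRunLog) return 0 ;;
    *) return 1 ;;
  esac
}

# Print an ALLOW/BLOCK decision for one API action name.
guard_action() {
  if is_read_only_action "$1"; then
    echo "ALLOW $1"
  else
    echo "BLOCK $1: this skill supports query operations only"
  fi
}

guard_action GetDataQualityScan
guard_action CreateDataQualityScanRun
```

An allowlist fails closed: a write API added to the service later is blocked without any change to the guard.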
All operations require dataworks:<APIAction> permissions on the target workspace.
Full permission matrix: references/ram-policies.md
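For orientation, a read-only RAM policy covering the nine APIs might look like the sketch below. It assumes the standard RAM policy JSON layout and uses "Resource": "*" purely for illustration; the function name print_readonly_policy is hypothetical, and references/ram-policies.md remains the authoritative source.

```shell
#!/usr/bin/env bash
# Print a minimal RAM policy sketch that allows only the nine read-only actions.
# "Resource": "*" is an illustrative assumption; narrow it where possible.
print_readonly_policy() {
  cat <<'EOF'
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dataworks:ListDataQualityTemplates",
        "dataworks:GetDataQualityTemplate",
        "dataworks:ListDataQualityScans",
        "dataworks:GetDataQualityScan",
        "dataworks:ListDataQualityAlertRules",
        "dataworks:GetDataQualityAlertRule",
        "dataworks:ListDataQualityScanRuns",
        "dataworks:GetDataQualityScanRun",
        "dataworks:GetDataQualityScanRunLog"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

print_readonly_policy
```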
When the user request is vague, use the following default path:
1. Confirm ProjectId and region. If either is missing, use Module 0. After completion, proactively suggest listing monitors.
2. Call ListDataQualityScans, present a table, and let the user choose a monitor. After completion, proactively suggest monitor detail.
3. Call GetDataQualityScan, summarize rules, monitored object, and trigger mode. After completion, proactively suggest recent runs.
4. Call ListDataQualityScanRuns, default to the most recent 10 rows, and highlight abnormal status. After completion, proactively suggest drilling into one run.
5. If a run is Fail / Error / Warn, call GetDataQualityScanRun and summarize per-rule results. After completion, proactively suggest log inspection.
6. If Results shows failed rules or runtime errors, call GetDataQualityScanRunLog to locate the root cause. After completion, proactively suggest whether further analysis is needed.
| Completed Operation | Recommended Next Step |
|---|---|
| ListDataQualityTemplates | "Would you like to view the full configuration of a specific template? (GetDataQualityTemplate)" |
| GetDataQualityTemplate | "Would you like to view monitors that use this template? (ListDataQualityScans)" |
| ListDataQualityScans | "Select a monitor to view its full configuration? (GetDataQualityScan)" |
| GetDataQualityScan | "View associated alert rules (ListDataQualityAlertRules) or recent run history (ListDataQualityScanRuns)?" |
| ListDataQualityAlertRules | "View details for a specific alert rule? (GetDataQualityAlertRule)" |
| GetDataQualityAlertRule | "Return to view run history for the associated monitor? (ListDataQualityScanRuns)" |
| ListDataQualityScanRuns | "View detailed results for a specific run? (GetDataQualityScanRun)" |
| GetDataQualityScanRun (Pass) | "This run passed. Would you like to view other run records or alert configuration?" |
| GetDataQualityScanRun (Fail/Error/Warn) | "Anomaly detected — recommend viewing execution logs to locate the root cause. (GetDataQualityScanRunLog)" |
| GetDataQualityScanRunLog (NextOffset=-1) | "Log retrieval complete. Is further analysis needed?" |
| GetDataQualityScanRunLog (NextOffset≠-1) | "Log not fully retrieved — continue fetching the next segment. (Retry with Offset)" |
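The last two rows describe an offset-based paging loop. The sketch below shows one way to write it, with several assumptions: the first segment is requested at offset 0, the Log and NextOffset fields may be nested at any depth in the response (so jq searches recursively instead of relying on a fixed path), and the function name fetch_full_log is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fetch a scan run log segment by segment until NextOffset is -1.
# Assumptions: the first segment starts at offset 0; Log and NextOffset may be
# nested anywhere in the response JSON, so jq looks for them recursively.
fetch_full_log() {
  local run_id=$1 offset=0 resp next
  while :; do
    resp=$(aliyun dataworks-public get-data-quality-scan-run-log \
      --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality \
      --connect-timeout 5 --read-timeout 10 \
      --id "$run_id" --offset "$offset") || return 1
    # Print the log text of this segment.
    printf '%s\n' "$resp" | jq -r '.. | objects | .Log? // empty'
    # Read the next offset; a missing field is treated as "done".
    next=$(printf '%s\n' "$resp" | jq -r '[.. | objects | .NextOffset? // empty] | first // -1')
    # Stop when complete, or when the offset does not advance.
    if [ "$next" = "-1" ] || [ "$next" = "$offset" ]; then
      break
    fi
    offset=$next
  done
}

# Usage (requires configured credentials): fetch_full_log <SCAN_RUN_ID>
```

The extra check for an offset that does not advance is a defensive choice to avoid an endless loop on an unexpected response.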
Trigger scenarios: Query data quality monitors/rules/templates/alerts/scan runs/logs, diagnose data quality check failures, view quality alert notification configuration, list/get quality scan/rule/template/alert/run
Not triggered (requests that belong to other skills):
- alibabacloud-dataworks-infra-manage
- alibabacloud-dataworks-workspace-manage
- alibabacloud-dataworks-datastudio-develop
Identify query intent → Environment check → Module 0 (if ProjectId/region missing) → Collect parameters → Execute command → Present results → Guide next step
Common aliases: DW = DataWorks, DQ = Data Quality, scan = monitor, scan run = execution record
If the alibabacloud-dataworks-workspace-manage skill is available, prefer using it for workspace lookup. The following is only a fallback.
aliyun dataworks-public list-projects --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality --status Available --page-size 100
Rules:
- Use the returned project list to help the user identify ProjectId
- If ProjectId is unknown, ask for it explicitly and never guess
- If region is unknown, offer common regions for confirmation: cn-hangzhou, cn-shanghai, cn-beijing, cn-shenzhen
- Once ProjectId and region are confirmed in the conversation, reuse them in later steps
Intent guidance:
- "there's a data quality issue" → ask whether the user wants monitor configuration, run records, or alert settings
- "show me this table" → start with list-data-quality-scans --table <TABLE_NAME>
Rule templates define reusable metric logic such as null rate, duplicate rate, row count, and custom SQL checks. Use this module when the user wants to know what a template checks, whether it is built-in or workspace-specific, and how its threshold logic is defined.
Always call ListDataQualityTemplates whenever the user asks about quality rule templates in their workspace. Never answer without invoking the API.
Scope: This API only returns workspace custom templates. It does not support querying system built-in templates.
--project-id is required; if the user has not provided ProjectId, collect it first via Module 0.
aliyun dataworks-public list-data-quality-templates --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --project-id <PROJECT_ID> [--name <FUZZY_NAME>] [--catalog <CATALOG_PATH>] [--page-number 1] [--page-size 10]
How to interpret the result:
- PageInfo.DataQualityTemplates[] is the working set for user selection
- Show Id, template name (from Spec), Catalog/category, and description; do not dump raw JSON
- Use Catalog and template naming patterns to tell the user what class of checks is available
- To inspect one template in full, follow up with GetDataQualityTemplate
aliyun dataworks-public get-data-quality-template --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --id <TEMPLATE_ID>
How to interpret the result:
- Spec: summarize the metric logic, parameter definitions, and threshold expression
- State whether the template is workspace-specific (ProjectId present) or is reused as a generic template
- Show the full Spec only when the user explicitly asks for raw detail
A data quality monitor (scan) is a concrete monitoring task bound to a table or field. Use this module to locate monitors, explain what they check, and understand how they are triggered.
aliyun dataworks-public list-data-quality-scans --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --project-id <PROJECT_ID> --page-number 1 --page-size 10 [--name <FUZZY_NAME>] [--table <FUZZY_TABLE_NAME>] [--sort-by "ModifyTime Desc"]
How to interpret the result:
- PageInfo.DataQualityScans[] is the candidate monitor list; show Id, Name, Description, owner, and latest update time
- If --table is used, explicitly tell the user these monitors are the likely matches for that table
- If the list is empty, suggest checking ProjectId, region, or relaxing Name / Table
aliyun dataworks-public get-data-quality-scan --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --id <SCAN_ID>
How to interpret the result:
- Spec: summarize monitored object, rule count, core metrics, and threshold settings
- Trigger: explain whether the monitor is ByManual or BySchedule
- ComputeResource and RuntimeResource: mention them only when they help explain execution behavior
- Parameters and Hooks: summarize only if they affect how the run is triggered or analyzed
Alert rules define when notifications are sent and to whom. Use this module when the user asks who gets notified, through which channel, and under what condition.
Receiver Type Quick Reference
| ReceiverType | Description |
|---|---|
| AliUid | Specific Alibaba Cloud account UID |
| DataQualityScanOwner | Owner of the data quality monitor task |
| TaskOwner | Owner of the associated scheduling task |
| DingdingUrl | DingTalk custom robot Webhook |
| FeishuUrl | Feishu custom robot Webhook |
| WeixinUrl | WeCom Webhook |
| WebhookUrl | Generic Webhook URL |
| ShiftSchedule | On-call schedule (notify by shift) |
aliyun dataworks-public list-data-quality-alert-rules --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --project-id <PROJECT_ID> --page-number 1 --page-size 10 [--data-quality-scan-id <SCAN_ID>] [--sort-by "CreateTime Desc"]
How to interpret the result:
- PageInfo.DataQualityAlertRules[] should be summarized as: rule ID, condition, channels, receivers, and associated monitor IDs
- Translate Notification.Channels into user-friendly channel names such as DingTalk, email, Feishu, SMS, or Webhook
- Group Notification.Receivers by receiver type instead of showing nested raw JSON
- If DataQualityScanId is provided, explicitly state these are the alert rules attached to that monitor
aliyun dataworks-public get-data-quality-alert-rule --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --id <ALERT_RULE_ID>
How to interpret the result:
A scan run is created every time a monitor executes. Use this module to inspect run history, diagnose failed checks, and read execution logs.
Status Quick Reference
| Status | Meaning | Recommended Path |
|---|---|---|
| Pass | All rules passed | No action needed |
| Fail | At least one rule failed to meet the threshold | GetDataQualityScanRun → Results → GetDataQualityScanRunLog |
| Error | Execution error (engine error, insufficient resources) | GetDataQualityScanRunLog to view error details |
| Warn | Warning triggered but did not reach the blocking threshold | GetDataQualityScanRun → Results to view metric values |
| Running | Execution in progress | Wait for completion before querying |
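The status table can be turned into a small lookup that picks the next call for a given run status. This is an illustrative sketch; the function name next_step_for_status is hypothetical, and the mapping simply mirrors the table above.

```shell
#!/usr/bin/env bash
# Map a scan run Status to the recommended follow-up from the table above.
# Returns non-zero for a status that the table does not list.
next_step_for_status() {
  case "$1" in
    Pass)    echo "No action needed" ;;
    Fail)    echo "GetDataQualityScanRun, then GetDataQualityScanRunLog" ;;
    Error)   echo "GetDataQualityScanRunLog" ;;
    Warn)    echo "GetDataQualityScanRun" ;;
    Running) echo "Wait for completion" ;;
    *)       echo "Unknown status: $1"; return 1 ;;
  esac
}

for s in Pass Fail Error Warn Running; do
  printf '%-8s -> %s\n' "$s" "$(next_step_for_status "$s")"
done
```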
aliyun dataworks-public list-data-quality-scan-runs --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --project-id <PROJECT_ID> [--data-quality-scan-id <SCAN_ID>] [--status <Pass|Running|Error|Fail|Warn>] [--create-time-from <TIMESTAMP_MS>] [--create-time-to <TIMESTAMP_MS>] [--filter '{"TaskInstanceId":"<INSTANCE_ID>"}'] [--sort-by "CreateTime Desc"] [--page-number 1] [--page-size 20]
Filter quick reference:
| Scenario | Filter JSON Example |
|---|---|
| Filter by scheduling instance | {"TaskInstanceId":"123456"} |
| Filter by run number | {"RunNumber":"2"} |
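The --create-time-from and --create-time-to flags take millisecond timestamps, so phrases such as "yesterday" or "last 7 days" have to be converted first. The helpers below are a sketch of that conversion; their names are hypothetical, and the two calendar-day helpers assume GNU date (the -d option) and the local time zone.

```shell
#!/usr/bin/env bash
# last_n_days_ms N [NOW_EPOCH_SECONDS]
# Prints "FROM_MS TO_MS" for the window from N x 24 hours ago until now.
last_n_days_ms() {
  local days=$1 now=${2:-$(date +%s)}
  echo "$(( (now - days * 86400) * 1000 )) $(( now * 1000 ))"
}

# today_ms: local midnight until now. Assumes GNU date (-d).
today_ms() {
  local start now
  start=$(date -d 'today 00:00:00' +%s) || return 1
  now=$(date +%s)
  echo "$(( start * 1000 )) $(( now * 1000 ))"
}

# yesterday_ms: yesterday 00:00:00 to 23:59:59 local time. Assumes GNU date (-d).
yesterday_ms() {
  local start midnight
  start=$(date -d 'yesterday 00:00:00' +%s) || return 1
  midnight=$(date -d 'today 00:00:00' +%s) || return 1
  echo "$(( start * 1000 )) $(( (midnight - 1) * 1000 ))"
}

# Example: build the time filter flags for "last 7 days".
range=$(last_n_days_ms 7)
echo "--create-time-from ${range% *} --create-time-to ${range#* }"
```

Passing the current time as an optional second argument keeps last_n_days_ms deterministic, which makes it easy to verify against a known timestamp.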
How to interpret the result:
- PageInfo.DataQualityScanRuns[] should be shown as a table with Id, Status, CreateTime, FinishTime, and key runtime parameters
- Highlight Fail, Error, and Warn, then recommend drilling into GetDataQualityScanRun
- For requests about recent failures, query with Status=Fail and a converted time range instead of asking for timestamps
aliyun dataworks-public get-data-quality-scan-run --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --id <SCAN_RUN_ID>
How to interpret the result:
- Status: state clearly whether the run passed, failed, warned, errored, or is still running
- Results: extract each rule's status, actual metric value, threshold, and whether it caused the overall failure; present this as a table instead of raw JSON
- Scan: use it as configuration snapshot context only when it helps explain the failure
- Parameters: mention runtime parameters when they may have influenced the result
- If Results shows failed rules or runtime errors, continue with GetDataQualityScanRunLog
aliyun dataworks-public get-data-quality-scan-run-log --user-agent AlibabaCloud-Agent-Skills/alibabacloud-dataworks-data-quality [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --id <SCAN_RUN_ID> [--offset <BYTE_OFFSET>]
How to interpret the result:
- Log is the raw execution trace; summarize the root cause first, then provide key excerpts if needed
- NextOffset = -1 means log retrieval is complete
- If NextOffset != -1, continue querying with the returned offset until completion when the user asks for the full log

- For Fail, check GetDataQualityScanRun results first, then read GetDataQualityScanRunLog.
- ListDataQualityTemplates returns workspace custom templates only; ProjectId is required. Built-in templates must be viewed in the DataWorks console.
- Show Spec on demand: Spec is often verbose. Summarize first, expand only on request.
- Empty results usually point to a wrong ProjectId, wrong region, or overly strict filters; suggest confirming parameters or relaxing filter conditions.
- For Fail / Error / Warn, do not just display the status; proactively provide the next diagnostic path.

| Error Code | Solution |
|---|---|
| Forbidden.Access / PermissionDenied | Check RAM permissions, see references/ram-policies.md |
| InvalidParameter | Verify parameter names, JSON shape, and required fields |
| EntityNotExists | Check whether the ID, ProjectId, and region match the target resource |
| InvalidPageSize | PageSize must be within the API-supported range, usually 1-100 |
Common regions: cn-hangzhou, cn-shanghai, cn-beijing, cn-shenzhen.
Endpoint format: dataworks.<REGION_ID>.aliyuncs.com
Full region and endpoint list: references/related-apis.md
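Because --region must always be paired with a matching --endpoint, it can help to derive both flags from a single region value. A minimal sketch, with hypothetical helper names dataworks_endpoint and region_flags:

```shell
#!/usr/bin/env bash
# Build the endpoint host for a region, following the documented format
# dataworks.<REGION_ID>.aliyuncs.com.
dataworks_endpoint() {
  echo "dataworks.$1.aliyuncs.com"
}

# Emit the paired --region/--endpoint flags, or nothing if no region is given.
region_flags() {
  [ -n "$1" ] || return 0
  echo "--region $1 --endpoint $(dataworks_endpoint "$1")"
}

region_flags cn-shanghai
```

Deriving the endpoint from the region in one place avoids the mismatch where the two flags point at different regions.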
| Reference | Description |
|---|---|
| references/ram-policies.md | RAM permission configuration and policy examples |
| references/related-apis.md | API parameter details and Region Endpoints |
| references/cli-installation-guide.md | Aliyun CLI installation guide |