Data Pipeline Toolbox (数据管道工具箱)

Pass. Audited by ClawScan on May 5, 2026.

Overview

This is a documentation-only ETL skill. Its broad data, credential, scheduling, and alerting examples fit its stated purpose, but they should be used only with verified tooling and scoped access.

Before installing or using this skill: verify what ./pipeline.sh actually is, confirm the correct package slug and source, use least-privilege credentials, approve each data source and destination individually, and make sure any schedules or webhooks can be disabled.

Findings (4)

This is an artifact-based, informational review of SKILL.md, package metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Finding 1: Package and script provenance

What this means

If a user or agent runs the examples as written, the commands may depend on an unreviewed local script (./pipeline.sh) or resolve to a package other than the one that was scanned.

Why it was flagged

The evaluated registry slug is data-pipeline-toolkit-v2, while the docs reference data-pipeline-toolkit, and the provided manifest contains no pipeline.sh helper. This is a packaging/provenance ambiguity rather than proof of malicious behavior.

Skill content
    clawhub install data-pipeline-toolkit
    ...
    ./pipeline.sh create my-pipeline
Recommendation

Verify the intended ClawHub package, source, and the contents of any pipeline.sh executable before running ETL commands.
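
A minimal pre-flight check with standard tools, assuming pipeline.sh has been fetched into the working directory (nothing here is specific to ClawHub):

    # Confirm what the helper actually is before executing it.
    file ./pipeline.sh          # expect a shell script, not an opaque binary
    sha256sum ./pipeline.sh     # record the hash so later installs can be compared
    less ./pipeline.sh          # read it: look for curl | sh patterns, eval, or embedded credentials
    # Only after review:
    ./pipeline.sh create my-pipeline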

Finding 2: Credential scope

What this means

Over-privileged database or warehouse credentials could let a pipeline read or write more data than intended.

Why it was flagged

The examples use database connection strings and can load data into external systems. That is expected for an ETL skill, but the metadata declares no required credentials or scoping guidance.

Skill content
    ./pipeline.sh load my-pipeline postgres --connection $DATABASE_URL
    ...
    ./pipeline.sh load user-logs clickhouse --connection $CH_URL
Recommendation

Use dedicated least-privilege credentials, test against non-production data first, and confirm exactly which sources and destinations each pipeline touches.
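
One way to scope the Postgres side is a dedicated read-only role, sketched here with psql; the role, database, and schema names are illustrative and not taken from the skill:

    # Illustrative: a login role that can only read the sales schema.
    psql "$ADMIN_URL" -c "CREATE ROLE etl_reader LOGIN PASSWORD 'change-me'"
    psql "$ADMIN_URL" -c "GRANT USAGE ON SCHEMA sales TO etl_reader"
    psql "$ADMIN_URL" -c "GRANT SELECT ON ALL TABLES IN SCHEMA sales TO etl_reader"
    # Point the skill at the scoped role instead of an admin user.
    export DATABASE_URL="postgres://etl_reader:change-me@db.example.internal:5432/warehouse"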

Finding 3: Scheduled persistence

What this means

A scheduled pipeline could keep transferring or transforming data until disabled.

Why it was flagged

The skill explicitly supports scheduled recurring pipelines. This persistence is purpose-aligned, but scheduled jobs can continue acting after initial setup.

Skill content
    定时调度:Cron任务或事件触发 [scheduled execution: cron jobs or event triggers]
    ...
    ./pipeline.sh schedule daily-sales "0 6 * * *"
Recommendation

Create schedules only after confirming the source, destination, frequency, owner, monitoring, and how to disable or roll back the job.
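
For reference, the cron expression in the quoted example decodes as below; the crontab commands assume the skill registers a plain cron entry, which the artifact implies but does not confirm:

    # "0 6 * * *" -> minute=0, hour=6, any day-of-month, any month, any weekday
    #                i.e. every day at 06:00 in the scheduler's timezone
    crontab -l                                       # audit what is actually installed
    crontab -l | grep -v 'daily-sales' | crontab -   # one way to disable: reinstall the list without the entry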

Finding 4: Alert data exposure

What this means

Pipeline names, error details, or operational metadata could be sent to third-party notification channels.

Why it was flagged

The monitoring examples send alerts to email or a webhook. This is expected for failure notification, but the artifact does not specify what data is included in alerts.

Skill content
    ./pipeline.sh alert my-pipeline email --to admin@example.com
    ...
    ./pipeline.sh alert my-pipeline webhook --url "https://open.feishu.cn/..."
Recommendation

Review alert contents, avoid sending secrets or raw records in notifications, and use trusted webhook destinations.
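
As a concrete pattern, a failure alert can carry identifiers and status only, never raw records or connection strings. The payload shape and ALERT_WEBHOOK_URL variable below are illustrative; inspect what pipeline.sh actually sends:

    # Minimal alert body: pipeline name, outcome, run id. No rows, no secrets.
    curl -fsS -X POST "$ALERT_WEBHOOK_URL" \
      -H 'Content-Type: application/json' \
      -d '{"pipeline":"my-pipeline","status":"failed","run_id":"2026-05-05T06:00:00Z"}'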