Skylv Data Pipeline Builder

Build ETL/data pipelines with natural language. Extract from databases/APIs, transform with code, load to destinations. No pipeline framework expertise needed.

Audits: Warn

Install

openclaw skills install skylv-data-pipeline-builder

data-pipeline-builder

Build data pipelines without framework expertise. Extract from any source, transform with code, load to any destination — all with natural language commands.

What It Does

  • Extract data — From databases, APIs, files, S3, GCS, Kafka
  • Transform — Filters, mappings, aggregations, joins, custom code
  • Load — To databases, data warehouses, files, APIs
  • Schedule — Cron-based or event-triggered execution
  • Monitor — Pipeline status, throughput, error rates
  • Validate — Schema checks, data quality rules

Quick Start

# 1. Create a simple pipeline
create pipeline from mysql users to postgres users_backup

# 2. Add transformation
add transform to users-backup: filter where active = true

# 3. Schedule it
schedule users-backup daily at 2:00 AM

# 4. Run and monitor
run pipeline users-backup
check pipeline status
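
To make the data flow concrete, here is a rough hand-written Python equivalent of the quick-start pipeline (pandas + SQLAlchemy). The connection strings, drivers, and credentials are placeholders, and this is only a sketch of the extract/filter/load steps, not the code Skylv itself generates:

# Hypothetical equivalent of the quick-start pipeline, written by hand.
# Connection strings and credentials are placeholders.
import pandas as pd
from sqlalchemy import create_engine

source = create_engine("mysql+pymysql://user:pass@prod-db/app")
dest = create_engine("postgresql+psycopg2://user:pass@backup-db/app")

# Extract: read the users table, keeping only active rows (the added transform)
users = pd.read_sql("SELECT * FROM users WHERE active = TRUE", source)

# Load: replace the backup table with the filtered snapshot
users.to_sql("users_backup", dest, if_exists="replace", index=False)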

Common Use Cases

🔄 Database Synchronization

# Sync production to analytics warehouse
create pipeline from mysql production.orders \
  to bigquery analytics.orders

# Run incremental sync every hour
schedule orders-sync hourly
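
For intuition, an incremental sync of this kind usually boils down to a watermark query. The sketch below assumes the orders table has an updated_at column and, for brevity, uses a Postgres warehouse instead of BigQuery; Skylv's own incremental strategy may differ:

# Hand-rolled incremental sync sketch. Assumes orders has an updated_at
# column and that the destination table already exists from a full load.
import pandas as pd
from sqlalchemy import create_engine, text

source = create_engine("mysql+pymysql://user:pass@prod-db/production")
dest = create_engine("postgresql+psycopg2://user:pass@warehouse/analytics")

# Find the latest row already loaded (the watermark)
with dest.connect() as conn:
    watermark = conn.execute(
        text("SELECT COALESCE(MAX(updated_at), '1970-01-01') FROM orders")
    ).scalar()

# Pull only rows changed since the last load and append them
changed = pd.read_sql(
    text("SELECT * FROM orders WHERE updated_at > :wm"),
    source,
    params={"wm": watermark},
)
changed.to_sql("orders", dest, if_exists="append", index=False)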

📊 API Data Extraction

# Pull data from REST API
create pipeline from api https://api.shop.com/orders \
  to postgres analytics.orders

# Add authentication
set source auth: bearer token xxx
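
Under the hood, this kind of extraction is just authenticated, paginated HTTP requests followed by a bulk insert. The sketch below assumes a page query parameter and a plain JSON array response, which are guesses about the endpoint, not facts from the original:

# Sketch of an API-to-Postgres extraction. The pagination scheme and
# response shape are assumptions about the endpoint.
import pandas as pd
import requests
from sqlalchemy import create_engine

API_URL = "https://api.shop.com/orders"
headers = {"Authorization": "Bearer xxx"}  # token from `set source auth`
dest = create_engine("postgresql+psycopg2://user:pass@warehouse/analytics")

rows, page = [], 1
while True:
    resp = requests.get(API_URL, headers=headers, params={"page": page}, timeout=30)
    resp.raise_for_status()
    batch = resp.json()
    if not batch:          # empty page signals the end
        break
    rows.extend(batch)
    page += 1

pd.DataFrame(rows).to_sql("orders", dest, if_exists="append", index=False)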

🧹 Data Cleaning

# Clean and transform data
create pipeline from csv raw_data.csv to postgres clean_data

add transform: \
  remove duplicates on email \
  fill nulls in age with 0 \
  validate email format
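
The three transforms above map directly onto ordinary dataframe operations. A rough pandas equivalent is shown below; the email regex is a simplification and the column names come from the example:

# Pandas sketch of the cleaning transforms listed above.
import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv("raw_data.csv")

# remove duplicates on email
df = df.drop_duplicates(subset="email")

# fill nulls in age with 0
df["age"] = df["age"].fillna(0)

# validate email format: keep only rows matching a basic pattern
email_ok = df["email"].astype(str).str.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
df = df[email_ok]

dest = create_engine("postgresql+psycopg2://user:pass@warehouse/analytics")
df.to_sql("clean_data", dest, if_exists="replace", index=False)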

📈 Analytics Preparation

# Aggregate for dashboards
create pipeline from postgres transactions \
  to postgres daily_summary

add transform: \
  group by date, product \
  aggregate sum(revenue), count(*) \
  where date >= yesterday
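
Because source and destination are the same Postgres instance here, the aggregation can be expressed as a single pushed-down query. This is a sketch under that assumption, with table and column names taken from the example:

# Sketch of the daily aggregation, pushed down to Postgres as SQL.
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:pass@warehouse/analytics")

summary = pd.read_sql(
    text("""
        SELECT date, product,
               SUM(revenue) AS revenue,
               COUNT(*)     AS order_count
        FROM transactions
        WHERE date >= CURRENT_DATE - INTERVAL '1 day'
        GROUP BY date, product
    """),
    engine,
)
summary.to_sql("daily_summary", engine, if_exists="replace", index=False)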

All Commands

Command                               Purpose
create pipeline from <src> to <dst>   Define new pipeline
add transform <pipeline>              Add transformation step
schedule <pipeline> <when>            Set run schedule
run pipeline <name>                   Execute immediately
check pipeline status                 View running pipelines
pause pipeline <name>                 Stop scheduled runs
view logs <pipeline>                  See execution history
validate <pipeline>                   Test without executing

Supported Sources & Destinations

Databases: MySQL, PostgreSQL, MongoDB, Redis, SQLite

Cloud Storage: S3, GCS, Azure Blob

Data Warehouses: BigQuery, Snowflake, Redshift

Streaming: Kafka, Kinesis, Pub/Sub

Files: CSV, JSON, Parquet, Excel
Requirements

  • Node.js 18+ or Python 3.8+
  • Source/destination connectors (auto-installed)
  • Optional: Airflow, Dagster for orchestration