Brightdata Cli

v1.0.0

Guide for using the Bright Data CLI (`brightdata` / `bdata`) to scrape websites, search the web, extract structured data from 40+ platforms, manage proxy zon...

by Meir Kadosh (@meirk-brd)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for meirk-brd/brightdata-brightdata-cli.

Prompt preview: Install & Setup
Install the skill "Brightdata Cli" (meirk-brd/brightdata-brightdata-cli) from ClawHub.
Skill page: https://clawhub.ai/meirk-brd/brightdata-brightdata-cli
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install brightdata-brightdata-cli

ClawHub CLI

Package manager switcher

npx clawhub@latest install brightdata-brightdata-cli
Security Scan
Capability signals
Requires OAuth token · Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (Bright Data CLI for scraping, SERP, pipelines, zones, budget) match the runtime instructions and the commands documented in references/*. The commands, pipeline types, and config paths all align with a CLI that manages Bright Data scraping and proxy zones.
Instruction Scope
The SKILL.md instructs the agent/user to install and run the Bright Data CLI and to authenticate (browser OAuth, device flow, or API key); it notes that the CLI saves credentials locally and auto-creates proxy zones. These actions are expected for this tool, but they include creating cloud-side resources (zones) and writing API keys to local config files (~/.config/brightdata-cli/ or equivalent), which goes beyond purely read-only guidance and has operational impact (billing, stored secrets).
Install Mechanism
There is no packaged install spec in the registry (instruction-only), but the SKILL.md suggests running 'curl https://cli.brightdata.com/install.sh | bash' and/or 'npm install -g @brightdata/cli' or 'npx --package @brightdata/cli'. Installing via an official vendor domain is coherent with the purpose, but piping a remote script to bash and performing global npm installs executes remote code on the host — a normal but higher-risk install pattern that the user should review/verify before running.
Credentials
The skill declares no required env vars, which matches metadata, but the docs reference an optional BRIGHTDATA_POLLING_TIMEOUT and allow API key override flags. The CLI stores credentials locally (credentials.json with suggested mode 0o600) and uses configuration files; this is expected. No unrelated credentials or unrelated system paths are requested.
Persistence & Privilege
The skill is not always-enabled and is user-invocable; it does not request elevated platform privileges. The only persistence described is the CLI storing credentials/config in the user's config directory and creating Bright Data proxy zones in the user's Bright Data account — both are consistent with the stated functionality.
Assessment
This skill is a documentation-only guide for the official Bright Data CLI and is coherent with that purpose. Before running anything the guide suggests:

  1. Verify you trust the vendor/domain (https://brightdata.com / https://cli.brightdata.com) and inspect the install script instead of blindly piping it to bash.
  2. Prefer 'npx' or install in a controlled environment if you don't want a global npm install.
  3. Understand that 'bdata login' stores an API key locally (~/.config/brightdata-cli/credentials.json) and the CLI may auto-create proxy zones, which can incur charges on your Bright Data account.
  4. If you need least privilege, use a scoped API key with limited permissions and remove local credentials when done.
  5. On a shared or production machine, review the install script and consider an isolated environment (container/VM).


latest: vk97e14wkvdggp6d1b37vdt0kwn85pnwv
18 downloads · 0 stars · 1 version
Updated 4h ago
v1.0.0 · MIT-0

Bright Data CLI

The Bright Data CLI (brightdata or bdata) gives you full access to Bright Data's web data platform from the terminal. It handles authentication, proxy zones, anti-bot bypass, CAPTCHA solving, and JavaScript rendering automatically — the user just needs to log in once.

Installation

If the CLI is not installed yet, guide the user:

macOS / Linux:

curl -fsSL https://cli.brightdata.com/install.sh | bash

Windows or manual install (any platform):

npm install -g @brightdata/cli

Without installing (one-off usage):

npx --yes --package @brightdata/cli brightdata <command>

Requires Node.js >= 20. After install, both brightdata and bdata (shorthand) are available.
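For setup scripts, the install can be made idempotent. `ensure_cli` below is a hypothetical helper (not part of the CLI): it installs globally via npm only when `bdata` is not already on the path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: install the Bright Data CLI only when it is missing.
ensure_cli() {
  if command -v bdata >/dev/null 2>&1; then
    echo "already installed"
  else
    # Global npm install; swap in the curl installer if you prefer it.
    npm install -g @brightdata/cli
  fi
}
```

Call `ensure_cli` once at the top of any automation before scraping steps.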

First-Time Setup

Before anything else, check if the user is authenticated. If they haven't logged in yet, guide them through the one-time setup:

# One-time login — opens the browser for OAuth, then everything is automatic
bdata login

This single command:

  1. Opens the browser for secure OAuth authentication
  2. Saves the API key locally (never needs to be entered again)
  3. Auto-creates required proxy zones (cli_unlocker, cli_browser)
  4. Sets default configuration

After login, every subsequent command works without any manual intervention.

For headless/SSH environments where no browser is available:

bdata login --device

For direct API key authentication (non-interactive):

bdata login --api-key <key>

To verify setup is complete, run:

bdata config
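In non-interactive environments the documented BRIGHTDATA_API_KEY override avoids the browser flow entirely. `ensure_auth` below is a hypothetical wrapper sketching that preference order; the fallback to `bdata login --device` matches the headless flow above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prefer the BRIGHTDATA_API_KEY environment override in CI,
# fall back to the device flow where no browser is available.
ensure_auth() {
  if [ -n "${BRIGHTDATA_API_KEY:-}" ]; then
    echo "using BRIGHTDATA_API_KEY from environment"
  else
    bdata login --device
  fi
}
```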

Command Reference

Read references/commands.md for the full command reference with all flags, options, and examples for every command.

Read references/pipelines.md for the complete list of 40+ pipeline types (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, and more) with their specific parameters.

Quick Command Overview

bdata is the shorthand for brightdata. Both work identically.

Command                          Purpose
bdata scrape <url>               Scrape any URL as markdown, HTML, JSON, or screenshot
bdata search "<query>"           Search Google/Bing/Yandex with structured results
bdata pipelines <type> [params]  Extract structured data from 40+ platforms
bdata pipelines list             List all 40+ available pipeline types
bdata status <job-id>            Check async job status
bdata zones                      List proxy zones
bdata budget                     View account balance and costs
bdata skill add                  Install AI agent skills
bdata skill list                 List available skills
bdata config                     View/set configuration
bdata login                      Authenticate with Bright Data
bdata version                    Show CLI version and system info

How to Use Each Command

Scraping

Scrape any URL with automatic bot bypass, CAPTCHA handling, and JS rendering:

# Default: returns clean markdown
bdata scrape https://example.com

# Get raw HTML
bdata scrape https://example.com -f html

# Get structured JSON
bdata scrape https://example.com -f json

# Take a screenshot
bdata scrape https://example.com -f screenshot -o page.png

# Geo-targeted scrape from the US
bdata scrape https://amazon.com --country us

# Save to file
bdata scrape https://example.com -o page.md

# Async mode for heavy pages
bdata scrape https://example.com --async
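The single-URL commands above compose naturally into batch jobs. `scrape_all` below is a sketch, not a built-in subcommand; the numbered output file naming is an assumption chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: scrape every URL listed in a file to numbered markdown files.
scrape_all() {
  local list="$1" outdir="$2" n=0 url
  mkdir -p "$outdir"
  while IFS= read -r url; do
    [ -z "$url" ] && continue
    n=$((n + 1))
    bdata scrape "$url" -o "$outdir/page-$n.md"
  done < "$list"
  echo "scraped $n pages"
}
```

Usage: `scrape_all urls.txt ./pages` with one URL per line in urls.txt.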

Searching

Search engines with structured JSON output (Google returns parsed organic results, ads, People Also Ask, and related searches):

# Google search with formatted table
bdata search "web scraping best practices"

# Get raw JSON for piping
bdata search "typescript tutorials" --json

# Search Bing
bdata search "bright data pricing" --engine bing

# Localized search
bdata search "restaurants berlin" --country de --language de

# News search
bdata search "AI regulation" --type news

# Extract just URLs
bdata search "open source tools" --json | jq -r '.organic[].link'

Pipelines (Structured Data Extraction)

Extract structured data from 40+ platforms. These trigger async jobs that poll until results are ready:

# LinkedIn profile
bdata pipelines linkedin_person_profile "https://linkedin.com/in/username"

# Amazon product
bdata pipelines amazon_product "https://amazon.com/dp/B09V3KXJPB"

# Instagram profile
bdata pipelines instagram_profiles "https://instagram.com/username"

# Amazon search
bdata pipelines amazon_product_search "laptop" "https://amazon.com"

# YouTube comments (top 50)
bdata pipelines youtube_comments "https://youtube.com/watch?v=..." 50

# Google Maps reviews (last 7 days)
bdata pipelines google_maps_reviews "https://maps.google.com/..." 7

# Output as CSV
bdata pipelines amazon_product "https://amazon.com/dp/..." --format csv -o product.csv

# List all available pipeline types
bdata pipelines list
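When the same pipeline type must run over several inputs, a small wrapper keeps a success count. `pipeline_each` is a hypothetical helper, not a CLI subcommand.

```shell
#!/usr/bin/env bash
# Sketch: run one pipeline type over several inputs and count successes.
pipeline_each() {
  local type="$1"; shift
  local ok=0 url
  for url in "$@"; do
    if bdata pipelines "$type" "$url"; then
      ok=$((ok + 1))
    fi
  done
  echo "$ok/$# succeeded"
}
```

Usage: `pipeline_each amazon_product "https://amazon.com/dp/A" "https://amazon.com/dp/B"`.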

Checking Status

For async jobs (from --async scrapes or pipelines):

# Quick status check
bdata status <job-id>

# Wait until complete
bdata status <job-id> --wait

# With custom timeout
bdata status <job-id> --wait --timeout 300
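For flaky networks, the `--wait` call can be retried. `wait_for_job` below is a hypothetical wrapper; the retry count and linear backoff intervals are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch: retry `bdata status --wait` with linear backoff on transient failures.
wait_for_job() {
  local id="$1" tries=0 max=3
  while [ "$tries" -lt "$max" ]; do
    if bdata status "$id" --wait --timeout 300; then
      return 0
    fi
    tries=$((tries + 1))
    sleep $((tries * 5))
  done
  return 1
}
```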

Budget & Zones

# Quick account balance
bdata budget

# Detailed balance with pending charges
bdata budget balance

# All zones cost/bandwidth
bdata budget zones

# Specific zone costs
bdata budget zone my_zone

# Date range filter
bdata budget zones --from 2024-01-01T00:00:00 --to 2024-02-01T00:00:00

# List all zones
bdata zones

# Zone details
bdata zones info cli_unlocker

Configuration

# View all config
bdata config

# Set defaults
bdata config set default_zone_unlocker my_zone
bdata config set default_format json

Installing AI Agent Skills

# Interactive picker — choose skills and target agents
bdata skill add

# Install a specific skill
bdata skill add scrape

# List available skills
bdata skill list

Output Modes

Every command supports multiple output formats:

Flag        Effect
(none)      Human-readable formatted output with colors
--json      Compact JSON to stdout
--pretty    Indented JSON to stdout
-o <path>   Write to file (format auto-detected from extension)

When piped (stdout is not a TTY), colors and spinners are automatically disabled.

Chaining Commands

The CLI is pipe-friendly:

# Search → extract first URL → scrape it
bdata search "top open source projects" --json \
  | jq -r '.organic[0].link' \
  | xargs bdata scrape

# Scrape and view with markdown reader
bdata scrape https://docs.github.com | glow -

# Amazon product data to CSV
bdata pipelines amazon_product "https://amazon.com/dp/xxx" --format csv > product.csv

Environment Variables

These override stored configuration:

Variable                    Purpose
BRIGHTDATA_API_KEY          API key (skips login entirely)
BRIGHTDATA_UNLOCKER_ZONE    Default Web Unlocker zone
BRIGHTDATA_SERP_ZONE        Default SERP zone
BRIGHTDATA_POLLING_TIMEOUT  Polling timeout in seconds
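A typical use is a session-scoped override, so every subsequent command in the shell picks up the values without touching stored config. The zone names and timeout below are placeholder values.

```shell
# Sketch: session-scoped overrides (placeholder values); later bdata commands
# in this shell use these instead of the stored configuration.
export BRIGHTDATA_UNLOCKER_ZONE=my_unlocker
export BRIGHTDATA_SERP_ZONE=my_serp
export BRIGHTDATA_POLLING_TIMEOUT=600
```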

Troubleshooting

Error                              Fix
CLI not found                      Install with npm i -g @brightdata/cli or curl -fsSL https://cli.brightdata.com/install.sh | bash
"No Web Unlocker zone specified"   bdata config set default_zone_unlocker <zone>, or re-run bdata login
"Invalid or expired API key"       Re-run bdata login
"Access denied"                    Check zone permissions in the Bright Data control panel
"Rate limit exceeded"              Wait and retry, or use --async for large jobs
Async job timeout                  Increase with --timeout 1200 or BRIGHTDATA_POLLING_TIMEOUT=1200

Key Design Principles

  • One-time auth: After bdata login, everything is automatic. No tokens to manage, no keys to pass.
  • Zones auto-created: Login creates cli_unlocker and cli_browser zones automatically.
  • Smart defaults: Markdown output, auto-detected formats from file extensions, colors only in TTY.
  • Pipe-friendly: JSON output + jq for automation. Colors/spinners disabled in pipes.
  • Async support: Heavy jobs can run in background with --async + status --wait.
  • npm package: @brightdata/cli — install globally or use via npx.
