Bigquery Optimizer

v1.0.0

Analyze BigQuery query patterns and storage to dramatically reduce costs.

by Anmol Nagpal (@anmolnagpal)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for anmolnagpal/bigquery-optimizer.

Prompt Preview: Install & Setup
Install the skill "Bigquery Optimizer" (anmolnagpal/bigquery-optimizer) from ClawHub.
Skill page: https://clawhub.ai/anmolnagpal/bigquery-optimizer
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Canonical install target

openclaw skills install anmolnagpal/bigquery-optimizer

ClawHub CLI


npx clawhub@latest install bigquery-optimizer
Security Scan
VirusTotal
Benign
OpenClaw
Benign
high confidence
Purpose & Capability
The name/description match the runtime instructions: the skill asks users to export BigQuery JOBS, table storage, and billing data and then performs offline analysis. It does not claim to access GCP itself and does not request credentials, which is proportionate to a remote analysis skill.
Instruction Scope
Instructions tell the user to run bq/gcloud commands and paste results. This is within scope, but the requested outputs include user_email, full query text, project/billing identifiers and other potentially sensitive data (PII, project IDs). The SKILL.md does state 'Never ask for credentials' and to confirm no credentials are included, but users should be warned to redact email addresses, service-account references, project IDs, or any embedded secrets before sharing.
Install Mechanism
No install spec and no code files; instruction-only skills are lowest-risk for installation since nothing is downloaded or written to disk.
Credentials
The skill requires no environment variables, primary credential, or config paths. The only sensitive surface is user-provided exported data (queries, billing), which is expected for this purpose but should be handled cautiously.
Persistence & Privilege
'always' is false; the skill does not request persistent presence or modify other skills or system settings. It is user-invocable and can be called by the agent (normal behavior).
Assessment
This skill appears coherent and does not request credentials, but the outputs it asks you to paste can include sensitive information (user email addresses, full SQL text, project IDs, billing account IDs, or accidentally copied service-account keys). Before sharing:

  1. Remove or anonymize user_email fields and any PII.
  2. Strip or redact project IDs/billing account IDs if you don't want them disclosed.
  3. Never paste service account keys, private keys, or any credentials.
  4. If possible, limit exported columns to only what's necessary (bytes_billed, table name, query fingerprint) or share a small anonymized sample.
  5. Run the bq/gcloud commands with a read-only, least-privileged account in a secure environment.

If you need higher assurance, ask the maintainer for a schema-only template you can fill with anonymized data before sending.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97abj469pnx532qgadztcyx5582am66
289 downloads
0 stars
1 version
Updated 1mo ago
v1.0.0
MIT-0

GCP BigQuery Cost Optimizer

You are a BigQuery cost expert. BigQuery is the #1 surprise cost on GCP — fix it before it explodes.

This skill is instruction-only. It does not execute any GCP CLI commands or access your GCP account directly. You provide the data; Claude analyzes it.

Required Inputs

Ask the user to provide one or more of the following (the more provided, the better the analysis):

  1. INFORMATION_SCHEMA.JOBS_BY_PROJECT query results — expensive queries in the last 30 days
    bq query --use_legacy_sql=false \
      'SELECT user_email, query, total_bytes_billed,
              ROUND(total_bytes_billed/1e12 * 6.25, 2) AS cost_usd,
              creation_time
       FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
       WHERE DATE(creation_time) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
       ORDER BY total_bytes_billed DESC
       LIMIT 50'
    
  2. BigQuery storage usage per dataset — to identify large datasets
    bq query --use_legacy_sql=false \
      'SELECT table_schema AS dataset,
              ROUND(SUM(total_logical_bytes)/1e9, 2) AS size_gb
       FROM `region-us`.INFORMATION_SCHEMA.TABLE_STORAGE
       GROUP BY 1
       ORDER BY 2 DESC'
    
  3. GCP Billing export filtered to BigQuery — monthly BigQuery costs
    gcloud billing accounts list
    
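Since the skill is instruction-only, the cost_usd arithmetic from the JOBS query above can be reproduced offline once results are pasted. A minimal Python sketch, assuming rows arrive as dicts keyed by the column names in that query (the $6.25/TB figure matches the on-demand rate used throughout):

```python
# Reproduce the cost_usd column from a pasted JOBS_BY_PROJECT export.
# Assumes each row is a dict with a total_bytes_billed field.

ON_DEMAND_USD_PER_TB = 6.25  # on-demand analysis pricing used in the query above

def query_cost_usd(total_bytes_billed: int) -> float:
    """On-demand cost: bytes billed / 1e12 * rate, rounded to cents."""
    return round(total_bytes_billed / 1e12 * ON_DEMAND_USD_PER_TB, 2)

def top_expensive(rows: list[dict], n: int = 10) -> list[dict]:
    """Rank pasted job rows by bytes billed and annotate each with its cost."""
    ranked = sorted(rows, key=lambda r: r["total_bytes_billed"], reverse=True)
    return [dict(r, cost_usd=query_cost_usd(r["total_bytes_billed"]))
            for r in ranked[:n]]
```

For example, a query that bills 2 TB comes out at $12.50; sorting a pasted export through top_expensive gives the "Top 10 Expensive Queries" section directly.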

Minimum required GCP IAM permissions to run the CLI commands above (read-only):

{
  "roles": ["roles/bigquery.resourceViewer", "roles/bigquery.jobUser"],
  "note": "bigquery.jobs.create needed to run INFORMATION_SCHEMA queries; bigquery.tables.getData to read results"
}

If the user cannot provide any data, ask them to describe their BigQuery usage patterns (number of datasets, approximate monthly bytes scanned, types of queries run).

Steps

  1. Analyze INFORMATION_SCHEMA.JOBS_BY_PROJECT for expensive queries
  2. Identify partition pruning opportunities (full table scans)
  3. Classify storage: active vs long-term (auto-transitions after 90 days)
  4. Compare on-demand vs slot reservation economics
  5. Identify materialized view opportunities for repeated expensive queries
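Step 2 can be approximated offline with a crude heuristic: flag any query that references a partitioned table but never mentions its partition column. The table-to-column mapping below is a hypothetical input the user would supply; a thorough analysis would use the query plan or job metadata instead of substring matching.

```python
# Hypothetical mapping supplied by the user: partitioned table -> partition column.
PARTITION_COLUMNS = {"events": "event_date", "logs": "_PARTITIONTIME"}

def missing_partition_filter(query: str) -> list[str]:
    """Return partitioned tables the query references without ever
    mentioning their partition column (full-scan candidates)."""
    q = query.lower()
    flagged = []
    for table, column in PARTITION_COLUMNS.items():
        if table in q and column.lower() not in q:
            flagged.append(table)
    return flagged
```

A query like `SELECT * FROM events` gets flagged, while one filtered on `event_date` does not; flagged tables feed the "Partition Pruning Opportunities" section.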

Output Format

  • Top 10 Expensive Queries: user/SA, bytes billed, cost, query preview
  • Partition Pruning Opportunities: tables scanned without partition filter, savings potential
  • Storage Optimization: active vs long-term split, lifecycle recommendations
  • Slot Reservation Analysis: on-demand vs reservation break-even point
  • Materialized View Candidates: queries run 10x+/day that scan the same data
  • Query Rewrites: plain-English explanation of how to fix each expensive pattern


Rules

  • BigQuery on-demand pricing: $6.25/TB scanned — even one bad query can cost thousands
  • Partition filters are the single highest-impact optimization — always check first
  • Slot reservations make sense when on-demand query spend exceeds $2,000/mo
  • Note: SELECT * on large tables is the most common expensive anti-pattern
  • Always show bytes billed (not bytes processed) — that's what costs money
  • Never ask for credentials, access keys, or secret keys — only exported data or CLI/console output
  • If user pastes raw data, confirm no credentials are included before processing
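The $2,000/mo slot threshold above follows from simple break-even arithmetic at the $6.25/TB on-demand rate. A sketch of that calculation (the flat monthly commitment price is an assumed input; verify against current BigQuery editions pricing):

```python
ON_DEMAND_USD_PER_TB = 6.25  # on-demand rate quoted in the rules above

def break_even_tb_per_month(slot_commitment_usd_per_month: float) -> float:
    """TB scanned per month at which on-demand spend equals a flat
    slot commitment; workloads scanning more than this favor slots."""
    return slot_commitment_usd_per_month / ON_DEMAND_USD_PER_TB
```

At a $2,000/mo commitment the break-even is 320 TB scanned per month, which is the basis of the "Slot Reservation Analysis" output section.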
