{"skill":{"slug":"plydb","displayName":"plydb","summary":"Skill for using the PlyDB CLI to perform SQL analysis of connected data sources. Use for SQL queries across heterogeneous databases and files such as Postgre...","description":"---\nname: plydb\ndescription:\n  Skill for using the PlyDB CLI to perform SQL analysis of connected data\n  sources. Use for SQL queries across heterogeneous databases and files such as\n  Postgres, MySQL, CSV, Parquet, JSON, Excel, SQLite, DuckDB, Google Sheets.\n  Triggers on \"plydb\", \"sql\", \"query\", \"data analysis\", \"parquet\", \"csv\",\n  \"excel\", \"database\".\n---\n\n# PlyDB CLI skill\n\nThe `plydb` CLI can be used to query across heterogeneous data sources.\n\n## Dependencies\n\nThe `plydb` binary must be available on the system.\n\nIf it is not, installation instructions can be found\n[here](https://github.com/kineticloom/plydb?tab=readme-ov-file#installation).\n\n## Instructions\n\n### Configure data sources\n\nFirst, the data sources to make available to PlyDB must be configured in a\nconfig file as per the specification in `references/config_schema.md`.\n\n### Query with SQL\n\nOnce you have a data source config file, PlyDB can query across all of the\nconfigured data sources. 
Use fully qualified table names: catalog.schema.table.\n\n```sh\nplydb query \\\n  --config path/to/config/file/config.json \\\n  \"SELECT * FROM customers.default.customers c\n   JOIN orders.default.orders o\n   ON c.id = o.customer_id\"\n```\n\n### Fetching semantic context of the data\n\nTo provide context to understand the domain and write correct SQL - PlyDB can\nbuild and provide semantic context from database `COMMENT` metadata alongside\ncolumn types and foreign keys as structured YAML that follows the\n[Open Semantic Interchange (OSI)](https://github.com/open-semantic-interchange/OSI)\nspecification.\n\n```sh\nplydb semantic-context --config path/to/config/file/config.json\n```\n\n#### Enriching auto-scanned context with overlays\n\nWhen the database lacks comments or you need to add relationships and metrics\nnot captured from source metadata, use `--semantic-context-overlay` to supply\none or more OSI YAML files that are merged on top of the auto-scanned model:\n\n```sh\nplydb semantic-context \\\n  --config path/to/config/file/config.json \\\n  --semantic-context-overlay path/to/overlay.yaml\n```\n\nThe flag is repeatable; overlays are applied in the order given:\n\n```sh\nplydb semantic-context \\\n  --config path/to/config/file/config.json \\\n  --semantic-context-overlay base_overlay.yaml \\\n  --semantic-context-overlay team_overlay.yaml\n```\n\nOverlay files must be valid\n[Open Semantic Interchange (OSI)](https://github.com/open-semantic-interchange/OSI)\nYAML.\n\nOverlays can add descriptions to existing datasets and fields, define\nrelationships between existing datasets, and add or update metrics. They cannot\nintroduce new datasets or fields - only enrich what was already discovered by\nthe auto-scanner.\n\nGood opportunities to create or edit an overlay file are when encountering a new\ndataset or after a session of data analysis with the user. 
These are great\nopportunities to distill your learnings about the data's semantics and record\nthem into an overlay file for future sessions. Ask the user first.\n\n#### Embedding overlays in the config file\n\nOverlays can also be specified in the config file under\n`semantic_context.overlays` instead of (or in addition to) the CLI flag:\n\n```json\n{\n  \"databases\": { ... },\n  \"semantic_context\": {\n    \"overlays\": [\n      \"path/to/base_overlay.yaml\",\n      \"path/to/team_overlay.yaml\"\n    ]\n  }\n}\n```\n\nWith overlays in the config, no extra flags are needed:\n\n```sh\nplydb semantic-context --config path/to/config.json\n```\n\nConfig-file overlays are applied before any `--semantic-context-overlay` flags.\n\n## Troubleshooting\n\n- [gsheet data source with interactive OAuth](./references/troubleshooting.md#gsheet-data-source-with-interactive-oauth)\n","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":541,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1773414519684,"updatedAt":1778491885889},"latestVersion":{"version":"1.0.0","createdAt":1773414519684,"changelog":"Initial release of the plydb skill:\n\n- Provides instructions for using the PlyDB CLI to query across heterogeneous data sources with SQL.\n- Supports Postgres, MySQL, CSV, Parquet, JSON, Excel, SQLite, DuckDB, and Google Sheets.\n- Details configuration of data sources and usage of fully qualified table names.\n- Explains how to generate and enrich semantic context using database comments and overlays adhering to the Open Semantic Interchange (OSI) specification.\n- Includes troubleshooting guidance for Google Sheets data source with interactive 
OAuth.","license":"MIT-0"},"metadata":null,"owner":{"handle":"ypt","userId":"s174rtqktex0ma71f1hx9amqkx885g7e","displayName":"ypt","image":"https://avatars.githubusercontent.com/u/2073178?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1780089870213}}