Data Classification
Pass
Audited by ClawScan on May 10, 2026.
Overview
This appears to be a purpose-aligned local data-classification skill. The only notes concern transparency: it runs a bundled Python helper and saves CSV output for larger schemas.
This skill looks reasonable for classifying field names and SQL/DDL schemas. Before using it, be comfortable with a local Python helper reading the schema you provide and with CSV files being generated for larger outputs. Avoid supplying real data rows or secrets when schema metadata is enough, and delete generated CSVs if they contain sensitive schema details.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The agent may execute the included helper script and read the SQL/DDL file path provided for classification.
The skill may run a bundled local Python helper against a user-specified SQL file. This is disclosed and central to the classification purpose, but users should know local code execution is part of the workflow.
python3 skills/data-classification/scripts/classify_data.py --sql path/to/schema.sql --mode finance --format markdown
Use it only with schema files you intend to classify, and avoid including real secrets or unnecessary production data in DDL comments.
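One way to act on this recommendation is to strip comments from a DDL file before handing it to the helper. The following is a minimal sketch, not part of the skill: `strip_sql_comments` is a hypothetical name, and the regular expressions are deliberately naive (they would also remove comment markers that appear inside string literals).

```python
import re

# Naive patterns (assumption): they do not account for comment markers
# that appear inside quoted string literals.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"--[^\n]*")


def strip_sql_comments(ddl: str) -> str:
    """Return the DDL text with /* ... */ and -- comments removed."""
    without_blocks = _BLOCK_COMMENT.sub("", ddl)
    return _LINE_COMMENT.sub("", without_blocks)


# Illustrative input only; the table and values are made up.
sample = """CREATE TABLE accounts (
    id INT, -- owner: alice, temp password hunter2
    iban VARCHAR(34) /* test value */
);"""

cleaned = strip_sql_comments(sample)
print(cleaned)
```

The column definitions survive while the comment text is dropped, so the cleaned file still carries the field names and types that classification relies on.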
Large classification jobs can leave a downloadable CSV on disk containing the classified field list and related metadata.
The skill directs the agent to create and attach a local CSV for larger schemas. This is purpose-aligned, but it means schema/classification output may persist as a local file and reveal an absolute filesystem path.
>20 fields: create a complete CSV result file ... attach the CSV with `MEDIA:<absolute-csv-path>`
Review or delete generated CSV files if the schema is sensitive, and ask the agent to use a specific export location if needed.
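The routing behavior described in this finding can be pictured roughly as follows. This is a sketch under stated assumptions, not the skill's actual code: the `deliver` function, the `(field, classification)` row shape, and the `classification.csv` file name are all hypothetical; only the "more than 20 fields" threshold comes from the quoted instruction.

```python
import csv
from pathlib import Path

CSV_THRESHOLD = 20  # from the quoted instruction: more than 20 fields goes to CSV


def deliver(fields, export_dir):
    """Return ("inline", rows) for small results or ("csv", path) for large ones.

    `fields` is a list of (field_name, classification) tuples (hypothetical shape).
    `export_dir` is a caller-chosen directory, so the file does not land
    in an unexpected location.
    """
    if len(fields) <= CSV_THRESHOLD:
        return "inline", list(fields)

    out_dir = Path(export_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "classification.csv"  # hypothetical file name
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["field", "classification"])
        writer.writerows(fields)
    # The absolute path is what would be exposed when the file is attached.
    return "csv", out_path.resolve()
```

Because the function returns the absolute path of any file it writes, the caller can review the CSV and remove it afterwards with `path.unlink()` if the schema is sensitive.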
Users may not immediately know why a file was created instead of receiving all rows inline.
The instruction asks the agent not to proactively disclose the threshold used to choose inline versus CSV output. It does not hide an external transfer or destructive action, but it reduces transparency.
Choose output delivery by field count internally, but do not explain this threshold policy to the user
If transparency matters, ask the agent to explain its output-routing choice or to follow your preferred output format.
