Dataset Splitter
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This appears to be a straightforward local dataset-splitting tool, but it moves files by default and users should review the annotation and install behavior before running it.
Before installing or running it, confirm you are pointing it at the intended dataset folder. Use --copy if you want a non-destructive split, use --yolo if you expect labels to be split by the script, and install Pillow in an isolated environment if you use the stats feature.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Running the default split can reorganize or remove images from the original folder, which may surprise users expecting a non-destructive copy.
The default behavior without --copy moves user-selected image files into split directories, which is purpose-aligned but mutates the source dataset.
if args.copy:
    shutil.copy2(img_path, dest_path)
else:
    shutil.move(img_path, dest_path)
Use --copy or back up the dataset when you want to preserve the original directory unchanged.
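One way to take that backup is sketched below. It is not part of the skill: backup_dataset and its arguments are hypothetical names, and it assumes the dataset fits on disk twice.

```python
import shutil
from pathlib import Path


def backup_dataset(src_dir, backup_dir):
    """Copy a dataset folder to backup_dir before running a move-based split.

    Hypothetical helper, not part of the skill under review.
    """
    src = Path(src_dir)
    dst = Path(backup_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    if dst.exists():
        # Refuse to merge into or overwrite an existing backup.
        raise FileExistsError(f"{dst} already exists")
    # copytree uses shutil.copy2 by default, so file metadata is preserved.
    shutil.copytree(src, dst)
    return dst
```

Calling backup_dataset("dataset", "dataset_backup") before the split leaves an untouched copy to restore from if the moved layout is not what was expected.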
If users run the documented non-YOLO annotation example, images may be moved while labels remain in the original annotation folder.
The script processes annotations only when a destination annotation directory exists, and it creates those directories only for --yolo output. SKILL.md, however, presents --annotations as splitting annotations together with images in general.
train_ann = os.path.join(output_dir, "train", "labels") if args.yolo and ann_dir else None
...
if args.annotations and src_ann_dir and dest_ann_dir:
Use --yolo when you need labels split by this script, or verify the output carefully before deleting or relying on the original dataset layout.
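A quick way to verify the output is to list images that have no matching label file. The sketch below is an assumption-laden illustration, not the skill's own code: find_unlabeled_images is a hypothetical helper, labels are assumed to share the image's file stem, and the extension list and ".txt" label suffix are guesses that may need adjusting.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(image_dir, label_dir, label_ext=".txt"):
    """Return sorted names of images in image_dir with no same-stem label."""
    labels = Path(label_dir)
    # A missing label directory means every image counts as unlabeled.
    stems = {p.stem for p in labels.glob(f"*{label_ext}")} if labels.is_dir() else set()
    return sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.stem not in stems
    )
```

Running it on each split directory and its label directory before deleting the original dataset shows whether any images were moved without their annotations.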
Installing an unpinned package can pull in different versions over time and carries the usual package supply-chain risk.
The skill asks users to install an external Python package without a pinned version; this is expected for image statistics but is not captured by an install spec.
pip install pillow
Install Pillow in a virtual environment and consider pinning a trusted version if reproducibility matters.
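If a version is pinned, a small check can confirm the environment matches it before the stats feature is used. This is a minimal sketch using the standard library's importlib.metadata; check_pinned is a hypothetical helper, and the version string in the usage note is only an example, not a recommendation.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pinned(package, expected):
    """Return (matches, installed_version) for an installed distribution.

    installed_version is None when the package is not installed.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False, None
    return installed == expected, installed
```

For example, check_pinned("pillow", "10.4.0") returns a pair whose first element is True only when exactly that version is installed in the active environment.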
