Missing User Warnings
Medium
- Confidence
- 95% confidence
- Finding
- The dataset-cleaning step recommends commands that remove corrupted and duplicate images and write cleaned output, but it does not warn that data may be deleted, excluded, or irreversibly altered if the script runs in place or is misconfigured. In a production ML workflow, accidentally removing source images can cause data loss, break reproducibility, and silently bias the training dataset.
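
One way to reduce the risk described in this finding is to make the cleaning step non-destructive by default. The sketch below is illustrative only and is not the reviewed script: the function names `find_removal_candidates` and `quarantine`, the `dry_run` flag, and the quarantine directory are all hypothetical. It assumes byte-identical files count as duplicates and uses an empty file as a stand-in for a real corruption check, which the actual tool may implement differently.

```python
import hashlib
import shutil
from pathlib import Path


def find_removal_candidates(src_dir):
    """List files a cleaning step would drop, without modifying anything."""
    seen = {}        # content hash -> first path seen with that content
    candidates = []  # (path, reason) pairs
    for path in sorted(Path(src_dir).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if not data:
            # Assumption: an empty file stands in for a real corruption
            # check such as attempting to decode the image.
            candidates.append((path, "empty"))
            continue
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            candidates.append((path, f"duplicate of {seen[digest].name}"))
        else:
            seen[digest] = path
    return candidates


def quarantine(candidates, src_dir, quarantine_dir, dry_run=True):
    """Move candidates aside instead of deleting them; dry run by default."""
    moved = []
    for path, reason in candidates:
        target = Path(quarantine_dir) / path.relative_to(src_dir)
        action = "WOULD MOVE" if dry_run else "MOVING"
        print(f"{action}: {path} ({reason})")
        if not dry_run:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

With this shape, a user sees exactly which images would be excluded before anything changes, and an explicit `dry_run=False` moves them to a separate directory from which they can be restored. The quarantine directory should sit outside the source directory so that it is not rescanned.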
