Install
openclaw skills install file-deduplicator

Find and remove duplicate files intelligently. Save storage space, keep your system clean. Perfect for digital hoarders and document management.

Vernox Utility Skill - Clean up your digital hoard.
File-Deduplicator is an intelligent duplicate-file finder and remover. It uses content hashing to identify identical files across directories, then offers options to remove the duplicates safely.
clawhub install file-deduplicator
const result = await findDuplicates({
  directories: ['./documents', './downloads', './projects'],
  options: {
    method: 'content', // content-based comparison
    includeSubdirs: true
  }
});

console.log(`Found ${result.duplicateCount} duplicate groups`);
console.log(`Potential space savings: ${result.spaceSaved}`);
const result = await removeDuplicates({
  directories: ['./documents', './downloads'],
  options: {
    method: 'content',
    keep: 'newest', // keep newest, delete oldest
    action: 'delete', // or 'move' to archive
    autoConfirm: false // show confirmation for each
  }
});

console.log(`Removed ${result.filesRemoved} duplicates`);
console.log(`Space saved: ${result.spaceSaved}`);
const result = await removeDuplicates({
  directories: ['./documents', './downloads'],
  options: {
    method: 'content',
    keep: 'newest',
    action: 'delete',
    dryRun: true // Preview without actual deletion
  }
});

console.log('Would remove:');
result.duplicates.forEach((dup, i) => {
  console.log(`${i+1}. ${dup.file}`);
});
findDuplicates

Find duplicate files across directories.

Parameters:
- directories (array|string, required): Directory paths to scan
- options (object, optional):
  - method (string): 'content' | 'size' | 'name' - comparison method
  - includeSubdirs (boolean): Scan recursively (default: true)
  - minSize (number): Minimum size in bytes (default: 0)
  - maxSize (number): Maximum size in bytes (default: 0)
  - excludePatterns (array): Glob patterns to exclude (default: ['.git', 'node_modules'])
  - whitelist (array): Directories to never scan (default: [])

Returns:
- duplicates (array): Array of duplicate groups
- duplicateCount (number): Number of duplicate groups found
- totalFiles (number): Total files scanned
- scanDuration (number): Time taken to scan (ms)
- spaceWasted (number): Total bytes wasted by duplicates
- spaceSaved (number): Potential savings if duplicates are removed

removeDuplicates

Remove duplicate files based on findings.
Parameters:
- directories (array|string, required): Same as findDuplicates
- options (object, optional):
  - keep (string): 'newest' | 'oldest' | 'smallest' | 'largest' - which to keep
  - action (string): 'delete' | 'move' | 'archive'
  - archivePath (string): Where to move files when action='move'
  - dryRun (boolean): Preview without actual action
  - autoConfirm (boolean): Auto-confirm deletions
  - sizeThreshold (number): Don't remove files larger than this

Returns:
- filesRemoved (number): Number of files removed/moved
- spaceSaved (number): Bytes saved
- groupsProcessed (number): Number of duplicate groups handled
- logPath (string): Path to action log
- errors (array): Any errors encountered

analyzeDirectory

Analyze a single directory for duplicates.
Parameters:
- directory (string, required): Path to directory
- options (object, optional): Same as findDuplicates options

Returns:
- fileCount (number): Total files in directory
- totalSize (number): Total bytes in directory
- duplicateSize (number): Bytes in duplicate files
- duplicateRatio (number): Percentage of files that are duplicates

config.json:

{
  "detection": {
    "defaultMethod": "content",
    "sizeTolerancePercent": 0, // exact match only
    "nameSimilarity": 0.7, // 0-1; higher = names must match more closely
    "includeSubdirs": true
  },
  "removal": {
    "defaultAction": "delete",
    "defaultKeep": "newest",
    "archivePath": "./archive",
    "sizeThreshold": 10485760, // 10MB threshold
    "autoConfirm": false,
    "dryRunDefault": false
  },
  "exclude": {
    "patterns": [".git", "node_modules", ".vscode", ".idea"],
    "whitelist": ["important", "work", "projects"]
  }
}
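The keep option in the removal settings above comes down to a ranking rule applied per duplicate group. A minimal sketch, assuming plain file records with mtimeMs and size fields; `splitGroup` is a hypothetical helper, not part of the skill's API:

```javascript
// Decide which copy in a duplicate group survives under a given keep policy.
// This is an illustrative sketch, not the skill's implementation.
function splitGroup(group, keep = 'newest') {
  const ranked = [...group].sort((x, y) => {
    switch (keep) {
      case 'newest': return y.mtimeMs - x.mtimeMs; // newest first
      case 'oldest': return x.mtimeMs - y.mtimeMs; // oldest first
      case 'smallest': return x.size - y.size;
      case 'largest': return y.size - x.size;
      default: throw new Error(`unknown keep policy: ${keep}`);
    }
  });
  // First in the ranking is kept; the rest are candidates for removal.
  return { keepFile: ranked[0], toDelete: ranked.slice(1) };
}

const group = [
  { file: 'report.pdf', mtimeMs: 1700000000000, size: 1024 },
  { file: 'report (1).pdf', mtimeMs: 1700000100000, size: 1024 },
];
const { keepFile, toDelete } = splitGroup(group, 'newest');
```

With keep: 'newest', the later-modified copy survives and the older one lands in `toDelete`, which matches the "keep newest, delete oldest" comment in the examples above.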
const result = await findDuplicates({
  directories: '~/Documents',
  options: {
    method: 'content',
    includeSubdirs: true
  }
});

console.log(`Found ${result.duplicateCount} duplicate sets`);
result.duplicates.slice(0, 5).forEach((set, i) => {
  console.log(`Set ${i+1}: ${set.files.length} files`);
  console.log(`  Total size: ${set.totalSize} bytes`);
});
const result = await removeDuplicates({
  directories: '~/Documents',
  options: {
    keep: 'newest',
    action: 'delete'
  }
});

console.log(`Removed ${result.filesRemoved} files`);
console.log(`Saved ${result.spaceSaved} bytes`);
const result = await removeDuplicates({
  directories: '~/Downloads',
  options: {
    keep: 'newest',
    action: 'move',
    archivePath: '~/Documents/Archive'
  }
});

console.log(`Archived ${result.filesRemoved} files`);
console.log(`Safe in: ~/Documents/Archive`);
const result = await removeDuplicates({
  directories: '~/Documents',
  options: {
    dryRun: true // Just show what would happen
  }
});

console.log('=== Dry Run Preview ===');
result.duplicates.forEach((set, i) => {
  console.log(`Would delete: ${set.toDelete.join(', ')}`);
});
Won't remove files larger than a configurable threshold (default: 10MB), preventing accidental deletion of important large files.
Can move files to an archive directory instead of deleting them. No data loss, full recoverability.
All deletions and moves are logged to a file for recovery and auditing.
The log file can be used to restore accidentally deleted files (limited undo window).
MIT
Find duplicates. Save space. Keep your system clean. 🔮