In today’s article, we’re going to find and remove duplicate files with rdfind. We’ll try to make this as safe as possible. I’d suggest newer users not actually worry about duplicate files. Allocate enough space to your OS and don’t worry about it. Disk space is cheap these days.
If you’re interested in removing duplicate files, then the rdfind application is one solution you can try. There are others, but we’ll be using rdfind. We may cover other choices in the future.
You don’t have to let rdfind delete the duplicate files automatically, and I suggest you don’t, at least at first. It’s good to see what will be deleted before it is actually deleted.
If you check the rdfind man page, you’ll see it’s described as:
rdfind – finds duplicate files
It does what it says on the tin. It finds duplicate files. You can run the command in a manner that automatically removes the found duplicates, but that’s not something to take lightly.
Again, and I can’t stress this enough, some duplicates are there for a reason – they belong there. So, don’t run this on the root directory and expect a good outcome. Running this on the root directory and automatically removing duplicates is going to break stuff. Feel free to do so, ’cause it’s your computer. Just don’t blame me when it breaks.
There… I feel you’re safely and properly informed! Let’s get this article started…
We’ll just use the terminal to install rdfind. On many desktops, pressing CTRL + ALT + T opens your default terminal emulator. You might as well leave it open, as rdfind also runs in the terminal and you’ll need it in the next step. Pick the command that matches your distro.
Debian, Ubuntu, and derivatives:
sudo apt-get install rdfind
Arch and derivatives:
sudo pacman -S rdfind
RHEL and CentOS (rdfind comes from the EPEL repository, so enable that first):
sudo yum install epel-release
sudo yum install rdfind
Fedora:
sudo dnf install rdfind
Now that you have installed rdfind, you should probably consult the man page. That’s an easy command:
man rdfind
With that knowledge fresh in your memory and rdfind installed, we can just jump into the article!
Your terminal should still be open from the previous step. If not, go ahead and open it now. You’ll need a terminal open to find and remove duplicate files with rdfind. It is not a graphical application.
So, I suppose you can start with this command:
rdfind /path/to/directory
That may look dangerous, but it’s not. If you run that command, it simply finds the duplicate files and writes a report to a text file, named results.txt by default, in the directory you ran it from. You then review the text file and manually remove the duplicate files. This is probably for the best. A dry run is just as harmless, because that option tells rdfind to report what it would do instead of changing anything:
rdfind -dryrun true /path/to/directory
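To make the review step a bit easier, something like the following may help. It’s a rough sketch, not an rdfind feature: it assumes rdfind wrote its default results.txt and that each non-comment line has eight whitespace-separated columns (duptype, id, depth, size, device, inode, priority, name), with DUPTYPE_FIRST_OCCURRENCE marking the copy that would be kept. Check the header line in your own results.txt to confirm the column order. The function name list_duplicates is made up for this example.

```shell
#!/usr/bin/env bash
# list_duplicates is a hypothetical helper, not part of rdfind.
# Assumption: non-comment lines in the report look like
#   duptype id depth size device inode priority name
list_duplicates() {
    local results_file="${1:-results.txt}"
    local duptype id depth size device inode priority name
    # The last variable (name) receives the rest of the line,
    # so file names containing spaces stay intact.
    while read -r duptype id depth size device inode priority name; do
        # Skip blank lines and comment lines such as the header and footer.
        case "$duptype" in ''|'#'*) continue ;; esac
        # The first occurrence is the copy that is kept, so don't list it.
        [ "$duptype" = "DUPTYPE_FIRST_OCCURRENCE" ] && continue
        # Print the size and the path of each duplicate, tab-separated.
        printf '%s\t%s\n' "$size" "$name"
    done < "$results_file"
}
```

Run it from the directory that holds the report, for example with list_duplicates results.txt, and read through the output before you delete anything.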
You can also have rdfind replace each duplicate with a hard link to the first copy found. The data is then stored only once, but every file name still works. While not recommended by me, it’s at least safer than deleting outright.
rdfind -makehardlinks true /path/to/directory
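If hard links are new to you, this small sketch shows the kind of end result that option aims for, without involving rdfind at all. It makes two identical files in a throwaway directory, replaces the second with a hard link to the first using ln, and then checks that both names refer to the same file. The file names are placeholders and the function name hardlink_demo is invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical demo; it only touches a temporary directory.
hardlink_demo() {
    local dir
    dir=$(mktemp -d) || return 1
    printf 'same content\n' > "$dir/original.txt"
    printf 'same content\n' > "$dir/duplicate.txt"

    # Replace the duplicate with a hard link to the original.
    ln -f "$dir/original.txt" "$dir/duplicate.txt"

    # -ef is true when both names refer to the same file (same device and inode).
    if [ "$dir/original.txt" -ef "$dir/duplicate.txt" ]; then
        echo "hard linked"
    else
        echo "separate files"
    fi
    rm -rf "$dir"
}
```

Keep in mind that once two names are hard linked, there is only one copy of the data. Editing the file through one name changes what you see through the other.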
Finally, you can go right ahead and find and remove duplicate files in one step! This is safer if you have a recent backup and you’ve already run one of the first two commands, because then you’ll know what’s going to be deleted.
rdfind -deleteduplicates true /path/to/directory
Just don’t run rdfind on your root directory, and probably don’t run it directly on your home directory, and you should be more or less okay. Feel free to run it on your Downloads folder, on your Documents folder, or even your Pictures folder.
Running rdfind that way, on those types of directories, will be fine and at least should not break things. Rdfind is pretty good at finding just duplicates, or I’d not recommend it. Be sure to back up first and make sure you give it a dry run before you start automatically removing stuff! Seriously, do not run this on your root directory.
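If you don’t trust yourself to remember that warning, you could put a small guard in front of rdfind. This is only a sketch under my own assumptions: the function name is_safe_rdfind_target is hypothetical, and it only refuses the root directory, your home directory itself, and paths that don’t exist. It won’t catch other risky spots, such as system directories or the parent of your home directory.

```shell
#!/usr/bin/env bash
# is_safe_rdfind_target is a hypothetical guard, not an rdfind feature.
# It succeeds only for an existing directory that is neither / nor $HOME.
is_safe_rdfind_target() {
    local target resolved home_resolved
    target="$1"
    # Resolve symlinks and relative paths; fail if the directory doesn't exist.
    resolved=$(cd -- "$target" 2>/dev/null && pwd -P) || return 1
    home_resolved=$(cd -- "$HOME" 2>/dev/null && pwd -P) || home_resolved="$HOME"
    case "$resolved" in
        /|"$home_resolved") return 1 ;;
    esac
    return 0
}
```

You might then chain it with a dry run, for example is_safe_rdfind_target ~/Downloads && rdfind -dryrun true ~/Downloads, so the dry run only starts when the check passes.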
And there you have it… You have yet another article! This time, we’ve learned how to find and remove duplicate files with rdfind. You were given a clear warning, but you’re gonna do what you’re gonna do. Man, I really need to write that article about backing up properly!
Thanks for reading!
I was wondering about this just the other day. Thanks.
LOL Glad ya liked it. I caution you to NOT run it on / or ~/! See my post on linux.org where I show what happens if you run it automatically on your root directory.
Suffice to say, running it on the root directory did not end well. The system refused to boot after. That was the expected behavior. You have duplicate files for a reason.