Find And Remove Duplicate Files With rdfind

In today’s article, we’re going to find and remove duplicate files with rdfind. We’ll try to make this as safe as possible. I’d suggest newer users not actually worry about duplicate files. Allocate enough space to your OS and don’t worry about it. Disk space is cheap these days.

Warning: Blindly removing duplicate files can be a risky operation. It can break things. You have been warned. Exercise caution!

If you’re interested in removing duplicate files, then the rdfind application is one solution you can try. There are others, but we’ll be using rdfind. We may cover other choices in the future.

You don’t have to let rdfind automatically delete the duplicate files, and I’m going to suggest that you don’t – at least at first. It’s good to see what’ll be deleted before it is actually deleted.

If you check the rdfind man page, you’ll see it’s described as:

rdfind – finds duplicate files

It does what it says on the tin. It finds duplicate files. You can run the command in a manner that automatically removes the found duplicates, but that’s not something to take lightly.

Again, and I can’t stress this enough, some duplicates are there for a reason – they belong there. So, don’t run this on the root directory and expect a good outcome. Running this on the root directory and automatically removing duplicates is going to break stuff. Feel free to do so, ’cause it’s your computer. Just don’t blame me when it breaks.

There… I feel you’re safely and properly informed! Let’s get this article started…

Install rdfind:

We’ll just use the terminal to install rdfind. To open your default terminal emulator, press CTRL + ALT + T and your default terminal should open. You might as well leave it open, as rdfind also runs in the terminal and you’ll need an open terminal in the next step.

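Debian/Ubuntu: sudo apt install rdfind
Arch/Derivatives: sudo pacman -S rdfind
RHEL/CentOS: sudo yum install rdfind (you may need the EPEL repository enabled first)
Fedora/Derivatives: sudo dnf install rdfind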

Now that you have installed rdfind, you should probably consult the man page. That’s an easy command: man rdfind

With that knowledge fresh in your memory and rdfind installed, we can just jump into the article!

Find And Remove Duplicate Files With rdfind:

Your terminal should still be open from the previous step. If not, go ahead and open it now. You’ll need a terminal open to find and remove duplicate files with rdfind. It is not a graphical application.

So, I suppose you can start with this command:

That may look dangerous, but it’s not. If you run that command, it simply finds the duplicate files and then creates a text file (results.txt, in the directory you ran it from) for you. You then review the text file and manually remove the duplicate files. This is probably for the best. It’s also the same thing if you do a dry run, like so:
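Here's a minimal sketch of both runs, assuming rdfind's default report file and its -dryrun option. It works on a throwaway directory instead of your real files, and the directory and file names are made up for illustration.

```shell
# Throwaway directory with two identical files and one unique file (hypothetical names).
demo="$(mktemp -d)"
printf 'same content\n' > "$demo/a.txt"
cp "$demo/a.txt" "$demo/b.txt"
printf 'something else\n' > "$demo/c.txt"

if command -v rdfind >/dev/null 2>&1; then
    # Report only: lists the duplicates in results.txt, written to the current directory.
    (cd "$demo" && rdfind .)
    # Dry run: describes what would be done without doing it.
    (cd "$demo" && rdfind -dryrun true .)
else
    echo "rdfind is not installed, so nothing was scanned"
fi

# Neither run removes anything, so all three files are still listed.
ls "$demo"
```

On your own system you'd swap the throwaway directory for something like ~/Downloads and then read through results.txt before deciding what to remove.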

You can actually have rdfind delete the duplicates and replace them with hard links to the first copy found. While not recommended by me, it’s at least safer than deleting them outright.

Finally, you can just go right ahead and just find and remove duplicate files! This is safer if you have both a recent backup and you’ve gone ahead and run one of the first two commands. Then, if you have run one of those two rdfind commands, you’ll know what’s going to be deleted.
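A sketch of the two destructive modes, assuming rdfind's -makehardlinks and -deleteduplicates options, plus -outputname to keep the report inside the demo directory. Everything happens in throwaway directories, and make_demo and the file names are hypothetical helpers for this example only.

```shell
# make_demo DIR: create DIR/keep/photo.jpg and an identical copy under DIR/extra.
make_demo() {
    mkdir -p "$1/keep" "$1/extra"
    printf 'same content\n' > "$1/keep/photo.jpg"
    cp "$1/keep/photo.jpg" "$1/extra/photo-copy.jpg"
}
links="$(mktemp -d)"; make_demo "$links"
purge="$(mktemp -d)"; make_demo "$purge"

if command -v rdfind >/dev/null 2>&1; then
    # Replace the duplicate with a hard link to the first copy found.
    rdfind -makehardlinks true -outputname "$links/results.txt" "$links/keep" "$links/extra"
    # Delete the duplicate outright. keep/ is listed first, so its copy is the one that stays.
    rdfind -deleteduplicates true -outputname "$purge/results.txt" "$purge/keep" "$purge/extra"
else
    echo "rdfind is not installed, so nothing was changed"
fi

ls -li "$links/keep" "$links/extra"   # matching inode numbers mean one copy on disk
ls "$purge/keep" "$purge/extra"
```

Listing the directory you want to keep first is a deliberate choice here, since rdfind ranks files by the order of the directories on the command line.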

Just don’t run rdfind on your root directory, and probably don’t run it directly on your home directory, and you should be more or less okay. Feel free to run it on your Downloads folder, on your Documents folder, or even your Pictures folder.

Running rdfind that way, on those types of directories, will be fine and at least should not break things. Rdfind is pretty good at finding just duplicates, or I’d not recommend it. Be sure to back up first and make sure you give it a dry run before you start automatically removing stuff! Seriously, do not run this on your root directory.

Closure:

And there you have it… You have yet another article! This time, we’ve learned how to find and remove duplicate files with rdfind. You were given a clear warning, but you’re gonna do what you’re gonna do. Man, I really need to write that article about backing up properly!

Thanks for reading! If you want to help, or if the site has helped you, you can donate, register to help, write an article, or buy inexpensive hosting to start your own site. If you scroll down, you can sign up for the newsletter, vote for the article, and comment.

Find Multiple Filenames By Extension – With Locate

In today’s article, we’re going to explore another way to find multiple filenames by extension – with locate. It’s a handy skill to have and will see you installing ‘mlocate’ to get access to the ‘locate’ command. It shouldn’t be a difficult or even very long article.

If this seems really familiar, then you’re paying attention. After all, it was just a couple of days ago that you saw this article:

How To: Find Multiple Filenames By Extension

So, why are we covering the same topic? Well, WordPress, for legitimate security concerns, likes to eat the backslash. (It’s a slash and a backslash. There’s no ‘forward slash’ if you want to be *technically* correct.)

Backslashes are understood programmatically, by many programs – including PHP. So, in theory it’d be possible to at least probe for exploits with an unescaped backslash. The solution is sort of to escape the backslash by including two of them, but then WordPress eats that escaping backslash every time you save a draft and add to it!

This is extremely frustrating as an author. It seriously sucks. It’s something I’ll need to keep in mind for future articles, always wary of the dastardly backslash! At least now I know…

Well, that hassle of escaping the backslash also reminded me that we can accomplish the same thing without any backslashes, just by using the ‘locate’ command. With the previous article still fresh in my memory, I figured I might as well write the same article – but with a different tool. Why not?!?

Install mlocate:

This article requires an open terminal, like many other articles on this site. If you don’t know how to open the terminal, you can do so with your keyboard – just press CTRL + ALT + T and your default terminal should open.

The ‘locate’ command is actually a part of the mlocate package. It’s not always installed by default, but it should be in every default repository out there. It should be easy enough for you to install. 

For the record, the ‘locate’ command describes itself like:

locate – find files by name

Well, that description looks promising – and is exactly what we’re hoping to accomplish! So then, go ahead and install it. You can install it just like you’d install any other software. In the terminal, it’d look something like:

Install mlocate In RHEL/CentOS: sudo yum install mlocate
Install mlocate In Debian/Ubuntu: sudo apt install mlocate

That’ll work for most distros, assuming you’re using those package managers. If you’re using a different distro, just go ahead and try the same command but adjusted for your package management software. You should be able to find and install it easily.

NOTE: You’re not done yet. The locate command works off of a database. It’s really quick to generate it and it will use a cron job to keep itself updated after that. So, to get the database started, you’ll want to use this command: sudo updatedb

With that done, you’re good to go to the next step…

Find Multiple Filenames By Extension With Locate:

Don’t close your terminal from the previous step! Like oh so many articles, this one also requires an open terminal. So, with your terminal still open, you can start to find filenames by extension with locate. For example:

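That will find filenames by extension with ‘locate’. Keep in mind that ‘locate’ searches its whole database, not just the current directory, unless the pattern itself includes a path. If you want to specify more filenames, it’s really simple: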

You can find just one file by extension:

Or you can find a few files by extension:
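Here's a sketch of both searches, assuming the mlocate (or plocate) versions of locate and updatedb. To keep it self-contained it builds a small private database over a throwaway directory with made-up file names. On a real system you'd drop the -d option and search the system database instead.

```shell
# Throwaway directory with a few hypothetical files to search for.
demo="$(mktemp -d)"
touch "$demo/tool.deb" "$demo/backup.zip" "$demo/distro.iso" "$demo/notes.txt"

found=""
# -l 0 skips the visibility check (so no root is needed), -U sets the directory
# to index, and -o says where to write the private database.
if command -v locate >/dev/null 2>&1 && command -v updatedb >/dev/null 2>&1 \
    && updatedb -l 0 -U "$demo" -o "$demo/demo.db" 2>/dev/null; then
    # One extension:
    locate -d "$demo/demo.db" "*.iso"
    # Several extensions: a path is listed if it matches any one of the patterns.
    found="$(locate -d "$demo/demo.db" "*.deb" "*.zip" "*.iso")"
    printf '%s\n' "$found"
else
    echo "locate/updatedb not available (or not the mlocate flavor), skipping"
fi
```

The quotes matter. Without them, the shell may expand the asterisks against the files in the current directory before locate ever sees the pattern.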

The sky’s the proverbial limit and the syntax is so much easier. It’s my understanding that the ‘locate’ command is faster because it relies on a database. I ran a couple of tests, using the article about how to time a command and the results weren’t really conclusive – but I only tested with very simple operations. So, your mileage may vary. Feel free to test it and let me know your results!

Closure:

Well, there’s another article. This time, you’ve learned how to find multiple filenames by extension with ‘locate’, and seen that ‘locate’ is a handy command with easier syntax. So, if you’re interested in the ‘locate’ command, be sure to check the man page (man locate). There are many folks who seem to prefer the ‘locate’ command in general, so it seemed like a good article to include.


How To: Find Multiple Filenames By Extension

Today’s article will show you how to find multiple filenames by extension, using the find command in the terminal. It’s a pretty handy skill to have for when you need to know where files of a certain extension reside on your file system.

If you got a new article notification yesterday, that’s because I’m an idiot. Instead of hitting the schedule button, I hit the publish button. I’m not sure what I was thinking. It was fairly early in the afternoon and I wasn’t even sipping wine at the time! Sorry for disturbing you unnecessarily. I almost sent out an ‘oops’ newsletter, but then I’d have just disturbed you twice.

Anyhow, this will be another article that makes use of the find command. The find command is a rather robust command and can be somewhat daunting for new people. I feel more comfortable writing articles that let you learn it in chunks, rather than trying to cover the entire thing. I do find it hard to explain, but I’ll do my best.

What’s this useful for? Well, let’s say you want to find .deb, .zip, and .iso files in your ~/Downloads directory. That’s what this command is going to do for you. You can find multiple filenames by extension in the terminal and it’s not overly complex once you understand the basics of the command.

Instead of making the intro needlessly longer, and to make up for today’s scheduling gaffe, I’ll keep the intro short and we’ll just run straight into the article…

Find Multiple Filenames By Extension:

In the intro, I mentioned that this was going to be done in the terminal. As such, we’re obviously going to need an open terminal for this exercise. To do so, press CTRL + ALT + T and your default terminal should open. Tada!

Warning: I do not explain this one as well as I’d hoped. So, I tried to explain by way of demonstrating. I’m hopeful that works.

Now, here’s the command I just ran in my terminal:
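Going by the description in this article, the command would be along these lines. Treat it as a sketch: the -type f test is an addition, and a temporary directory with made-up file names stands in for ~/Downloads so it can be run anywhere.

```shell
# Stand-in for ~/Downloads; point this at your own directory instead.
downloads="$(mktemp -d)"
touch "$downloads/tool.deb" "$downloads/backup.zip" "$downloads/distro.iso" "$downloads/notes.txt"

# \( ... \) groups the tests, -o means "or", and the quotes stop the shell
# from expanding the asterisks before find sees them.
matches="$(find "$downloads" -type f \( -name "*.deb" -o -name "*.zip" -o -name "*.iso" \))"
printf '%s\n' "$matches"
```

The three matching files are listed, and notes.txt is left out because no -name test covers it.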

Now, if you want to run it in the current directory, you can specify the directory or you can change ~/Downloads to a . (period).

If you want to find just one file, you’d stop after "*.deb" and leave the closing \). If you want to add additional files, you would include -o -name "*.<extension>" and make sure to keep the closing \).

It might be easier to show you. For formatting reasons, I’ll use the . (period) instead of specifying a directory. It’ll fit on your screen better than a longer command. So, “How To:”…

Find One File By Extension:
Find Two Files By Extension:
Find Three Files By Extension:
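A sketch of those three cases, following the pattern described above. It runs in a throwaway directory, and the file names are made up for illustration.

```shell
# Work in a throwaway directory with a few hypothetical files.
cd "$(mktemp -d)"
touch tool.deb backup.zip distro.iso notes.txt

# One extension: stop after "*.deb" and keep the closing \).
one="$(find . \( -name "*.deb" \))"

# Two extensions: add one -o -name test.
two="$(find . \( -name "*.deb" -o -name "*.zip" \))"

# Three extensions: add another -o -name test.
three="$(find . \( -name "*.deb" -o -name "*.zip" -o -name "*.iso" \))"

printf 'one:\n%s\ntwo:\n%s\nthree:\n%s\n' "$one" "$two" "$three"
```

Each extra extension is just one more -o -name "*.<extension>" inside the parentheses.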

So, hopefully you can see how this find command works. I can’t think of a better way to explain the command than to show it to you in examples. I hope that works for people. Feel free to comment in either direction, as I think it might work for some but be less effective for others.

In theory, you could find all sorts of files by extension. Just remember to include the -o -name and the file type, and note that the asterisk is a wildcard in this instance, meaning all files with that extension will be found. So, .gz files would be "*.gz". You can make the command as long as your heart desires!

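Well, no… There’s bound to be an upper limit somewhere. (Wait, I looked it up. The figure usually quoted is 4096 characters for a single line of terminal input, and the system’s overall limit on the size of a command’s arguments is whatever getconf ARG_MAX reports. And now we know…)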

EDIT: You have no idea how much of a pain in the butt this article turned out to be. Holy crap. For safety reasons, WordPress eats the backslash \. I did not know this. Nobody knows this. The solution is to escape the backslash by using it twice. This article is full of backslashes. I think I got them all. It eats them every time I save the draft, so hopefully they show up in publication. I can never edit this article again, so it is what it is. Well, I could edit it again, but it’d be a pain in the butt.

Closure:

So, yeah… Today we’ve learned to find multiple filenames by extension. At least I hope we have. It’s not so easy to explain, but I figured if I explained it by showing examples then you’d be able to pick it up in context. If you do have any questions, just drop ’em into the comment box below and I’m usually pretty speedy at getting back to people. As always, the man page is probably helpful.

Again, sorry about the fake article notification. That doesn’t happen often, but it does sometimes happen. In an ideal world, I’d have an awesome editor and I would just save everything as a draft. If you’re interested in volunteering for that role, let me know! It’d make my life so much easier, I think… I mean, I don’t really know… It just seems like something that’d help.

Also, I’m pretty excited to write this month’s meta article. I’ll probably wait and schedule it for the holiday or a weekend day. They’re not important articles, but I find it interesting. The site’s growing steadily.


How To: Find Files Owned By A Specific User

In today’s article, we’ll be learning how to find files owned by a specific user. We’ll even use the ‘find’ command, as we find files owned by a specific user! That seems to be the best idea, and the best way to do it.

This should also be a fairly quick article. I don’t see any reason why I’d have to make it longer than it needs to be. So, it won’t take too much of your time today.

This article will be published on November 11. That’s a day known by a number of other names. It’s Veterans Day, Armistice Day, Remembrance Day, and probably a dozen more names that I don’t know. It was the day WWI ended, which was thought to be the war to end all wars ’cause it was just that horrific.

Well, as you can see, it was definitely not the last war – but we still choose this day to remember. In the US, veterans are celebrated today. Memorial Day is only for those who are no longer with us. Today is for the vets, as well as those who are no longer here.

It’s a holiday, which means it’s a fine day to have a nice and simple article. It’s a fine day to cherish your friends and family, instead of spending your time online reading Linux articles. (But thanks for doing so!)

Find Files Owned By A Specific User:

This article requires an open terminal, like so very many other articles. If you don’t know how to open the terminal, you can do so with your keyboard, just press CTRL + ALT + T and your default terminal should open.

With your terminal now open, you should probably navigate to a directory other than your home directory. If you run this command in your home directory, it’s gonna output a whole lot of text. So, let’s just try this first in your ~/Downloads directory: cd ~/Downloads

The command we’re going to use is find, and the format is the find command, a dot to say the current directory, the -user flag, and then the username. So, your command would look something like: find . -user <username>

(No brackets, of course.)

You probably don’t have any files owned by root in that directory, so a good test to make sure it’s working properly would be something like:
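A quick sketch of that test, using your own account as the username. It runs in a throwaway directory with a made-up file name, and the -type f test is an addition so that only files are listed.

```shell
# Throwaway directory so the example runs anywhere; use ~/Downloads on your own machine.
cd "$(mktemp -d)"
touch mine.txt

# Your own username. If the account has no name, fall back to the numeric ID,
# which -user also accepts.
me="$(id -un 2>/dev/null)" || me="$(id -u)"

# General form: find . -user <username>
owned="$(find . -type f -user "$me")"
printf '%s\n' "$owned"
```

Since you just created mine.txt, it's owned by you and should be the one file listed.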

Now, you can mix things up a bit. Instead of using the dot to indicate the current directory, let’s find files owned by root in the /etc directory.
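A sketch of that search. The -maxdepth 1 and head are additions to keep the output short, and the redirect hides permission errors that can show up when you run it without sudo.

```shell
# List up to ten entries directly under /etc that are owned by root.
# Drop -maxdepth 1 to search all of /etc (you may want sudo for that).
root_owned="$(find /etc -maxdepth 1 -user root 2>/dev/null | head -n 10)"
printf '%s\n' "$root_owned"
```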

See? That’s not all that hard at all. It’s remarkably easy to find files owned by a specific user – and the command really isn’t that hard to memorize! You can run it in the folder you’re in, or you can use the directory path method.

Closure:

I told you that it’d be a quick article today. It’s a good day for just a quick tip and everyone can benefit from knowing how to find files owned by a specific user. Toss this tip into your growing list of tools in your Linux toolbox, because you never know when this will come in handy.


How To: Check If A Specific Port Is Open

In today’s article, we’re going to learn how to check if a specific port is open. The command is simple, but versatile. It’s also pretty quick to check and see if a port is open. Read on, as I try to make it easy!

From a security standpoint, it’s a good idea to identify what ports are open, and what function those open ports have. From a usability standpoint, it’s good to know which ports are open so that you can connect to the device.

I suppose, as a general rule, you could probably assign ports to do all sorts of things. However, it’s actually standardized (in many cases) and specific ports will be open for specific things. 

You may have found yourself using different ports. If you owned a website, then the address to your control panel might be something like https://example.com:9000 or similar. Your server will have open ports for other things, like port 80 for HTTP or port 22 for SSH.

If you are curious, it’s worth reading up on the standard port assignments. If you’re new to the concept, a reference like that might actually help explain things better than I can. We do rely on standardized port numbers quite a bit.

When you’re browsing the regular web, you’re not necessarily aware but you’re using the site’s port 80 to get the public-facing web data. While you could host your site on a different port, it’d take some configuration changes on the back end. I suppose you could just do some work with htaccess if that was your goal, but it’s a pretty pointless goal.

Check If A Specific Port Is Open:

You should think of open ports as public information! They’re not secrets. It’s easy to find open ports, so you’ll need to secure them properly. It’s a good idea to know what traffic is happening on what port, as ports are open for a reason.

For example, there’s no security lost if I point out this site’s running on a server with port 80 open. Of course it’s open. That’s how you browse it. There are ways to hide your open ports, but that’s beyond the scope of this article. This article is just going to show you how to check if a specific port is open.

Like much of the time, you will need to have an open terminal. Of course, if you want to open your default terminal emulator, just press CTRL + ALT + T and your default terminal should open.

With your terminal open, you can just use ‘localhost’ to test your own computer. For example, you might have an open port 80 or port 22 for SSH. So, to check those, your commands would look like:
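A minimal sketch of those checks, assuming the tool in question is netcat (nc), which matches the behavior described in this article. The check_port wrapper is a hypothetical helper, and the -w 3 timeout is an addition so a silent port can't leave the command waiting forever.

```shell
# check_port HOST PORT
# -z: just probe, send no data; -v: report the result; -w 3: give up after 3 seconds.
check_port() {
    nc -z -v -w 3 "$1" "$2"
}

for port in 80 22; do
    if check_port localhost "$port"; then
        echo "localhost port $port is open"
    else
        echo "localhost port $port is closed or unreachable"
    fi
done
```

Typed directly, each check is just nc -z -v localhost 80 or nc -z -v localhost 22.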

You can also check remote servers. You can even check those on your network by using their IP address or their hostname. You’re familiar with my usage of ‘kgiii-msi.local’, so we’ll use that.

That also works with this site and other sites on the internet. Just use the fully qualified domain name (FQDN) instead of an IP address. As an example, try the following command:
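A sketch of the same check against other machines, again assuming netcat (nc) with an added three second timeout. The first hostname only exists on the author's network, so substitute a machine of your own.

```shell
port=80
# kgiii-msi.local is the author's LAN hostname; linux-tips.us is this site.
for host in kgiii-msi.local linux-tips.us; do
    if nc -z -v -w 3 "$host" "$port"; then
        result="open"
    else
        result="closed or unreachable"
    fi
    echo "$host port $port: $result"
done
```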

See? You have now confirmed that this site has port 80 open. Congratulations, you’re a 1337 h4X0R! But, now you can check if a specific port is open, a valuable skill to have. 

By the way, if the command appears to stop and not give you a result, press CTRL + C to halt the process. If the port isn’t open, and the server doesn’t respond to tell you that it’s closed, the command will keep running indefinitely. So, it’s good to know how to stop it.

Closure:

Again, this article has shown you how to check if a specific port is open. An open port doesn’t mean anything bad, necessarily. The command you’re using in this article will also try to tell you what traffic is expected on the open port. It looks a little something like this:

[Image: linux-tips.us has an open port 80, just like every other site on the planet. Caption: Oh no! Linux-Tips.us has an open port! (It’s fine. It’s how you’re seeing the site!)]

Of course, that’s just http traffic, so try running the following command to see where you’re getting the https (secure) content:
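A sketch of that check, assuming netcat (nc) is the tool in use. The three second timeout and the status message are additions.

```shell
site="linux-tips.us"   # the site from this article
# Port 443 is the standard port for https traffic.
if nc -z -v -w 3 "$site" 443; then
    https_status="open"
else
    https_status="closed or unreachable"
fi
echo "$site port 443: $https_status"
```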

That’ll show you that not only is the port open, but that that port (443) is used for https traffic, which is really what the site uses for you the reader. I obviously have https configured, updated, and properly implemented. I want you to have a secure connection, as secure a connection as you want.

Anyhow, this is getting to be a long postscript… This is turning into a fairly long article. We’ll see how many people read past the ‘CLOSURE:’ text! 

Think of ports like doors. Just because it’s open doesn’t mean you can go in and help yourself. Also, it’s not polite to go around knocking on random doors just to see if they’re open. Feel free to check this site, as I’m confident about the security.

Well, I hope you’ve learned how to check if a specific port is open. This seemed like a good thing to share. It’s also useful if you want to SSH into a remote computer and need to make sure the port is open as one of your debugging steps! (See?!? All the articles can be SSH articles!)


Linux Tips
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.