Save A Web Page As Text

If you’re at all like me, you document all sorts of things and you too might find it handy to know how to save a web page as text. It’s not a complicated task; you can do it in the terminal easily enough. So, if you want to save a web page as text, read on! 

This intro should be rather short. Imagine that!

I don’t have to explain what a web page is. It’s a page (just a page) on a website.

I don’t have to explain what text means. We’ll just be using .txt files.

While this isn’t something I’ve bothered with in a long time, you might find it interesting and helpful. If you’re into keeping notes of things you want to learn more about and remember, you may find saving a web page as text worthwhile.

You can organize the text files however you want and one of the best benefits is that you can perform searches on your local documents easily enough. This might be something that interests you, especially if you’re new and browsing around the web looking for things to learn.

We’ll only be using a couple of tools. We will be using the terminal.

curl:

The first application you’ll need to save a web page as a text file is curl. The curl application transfers data to or from a server identified by a URL. By default, a curl command downloads the content at that URL and writes it to your standard output.

If you check the man page, you’ll see:

curl – transfer a URL

See? Exactly as I had said. It’s the correct tool for the job. 

You can also see this article about curl:

Let’s Have a Limited Look at Linux’s cURL Application

html2text:

This should be obvious from the name, and further obvious from the title of this article. This is an application that turns HTML (Hypertext Markup Language, which is what web pages are written in more often than not) into plain text.

If you check the man page, you’ll see:

html2text – an advanced HTML-to-text converter

Once again, a fine application for the task at hand. You’ll see!

Save A Web Page As Text:

As mentioned above, this is a terminal-based operation. We’re going to save a web page as text, but we’re going to do it in the Linux terminal. More often than not, a terminal can be opened by pressing CTRL + ALT + T on your keyboard.

I’ll give installation instructions for the apt-using distros out there. These packages will be available in your package manager if you’re using any of the major distros. Just adjust these commands to match your needs.

curl: sudo apt install curl
html2text: sudo apt install html2text

We’re interested only in the -o (output) flag for this application of html2text.

The Process:

The syntax to save a web page as text is simple. It looks like this:
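Here is a minimal sketch of that syntax, wrapped in a small function so the pieces are easy to see. The function name save_page_as_text and the example.com address are made up for illustration. It also assumes the html2text found in the apt repositories, which accepts -o for the output file.

```shell
# General form: curl URL | html2text -o FILE

# Hypothetical helper: fetch a page with curl and convert it with html2text.
# Assumes both tools are installed and that html2text supports -o.
save_page_as_text() {
    page_url="$1"
    output_file="$2"
    # curl writes the HTML to standard output; the pipe feeds it to html2text
    curl "$page_url" | html2text -o "$output_file"
}

# Usage (needs network access):
#   save_page_as_text https://example.com/ example.txt
```

Nothing is downloaded until you call the function yourself with a real URL and a file name of your choosing.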

Simply, we’re using the curl application to grab the data. We then send that data through the pipe, where it’s processed by the html2text application and written to the file named after the -o flag.

An example would look like this:

You can, of course, save individual pages as text. Here’s an example:

The terminal output is interesting:

Then, you can use a plain text editor to read (and edit) the text file. You can view it in the terminal with just the cat command. That’d look like this:

Though, it’s probably easier to read the saved file with a decent plain text editor that has a GUI. There’s an abundance of text editors available for Linux, so pick your favorite and use that to read the saved output.

Closure:

Well, if you have ever wanted to save a web page as text, you now know how to do that. This was an article that came not from my notes but from my memory. I used to do this with some regularity but I’ve stopped doing so as of late. I haven’t kept so many new notes lately, though I’m not sure why not.

Anyhow, this is a nice and simple exercise that anyone should be able to follow. If you’re using a different package manager it may take a bit more effort, but it’s not complicated. The packages should be available in all the major distros, or something similar. The curl application will certainly be available and might even be installed by default.

Thanks for reading! If you want to help, or if the site has helped you, you can donate, register to help, write an article, or buy inexpensive hosting to start your site. If you scroll down, you can sign up for the newsletter, vote for the article, and comment.

Mastering the Linux Terminal Pipe Command

Well, if it’s not obvious by the title, it soon will be obvious that I’ve once again leaned on AI to write an article, this time about the pipe command. I decided to stick (mostly) to the title AI gave this article, but it was longer than it should be.

AI tried to title this:

“Mastering the Linux Terminal Pipe Command: A Comprehensive Guide”

Anyhow, this is one of those articles that I just can’t write. No matter what I write, it will not be adequate – even though the pipe is a simple enough concept. Much like a recent grep article, this is just one of those articles I won’t write well.

Also, I’m not sure that I should call it a command. It’s more an operator than a command (the Bash manual lists | among its control operators), but the references I see refer to it as a command more frequently than as an operator. It’s certainly not an ‘operand’, which is the thing an operator acts on. But, for convenience and convention’s sake, I will call it the pipe command.

No, this isn’t something you install. This is a command that you use with other commands. It’s a lot like the operators I’ve already written about. If you’re unfamiliar with the concept, read this article:

How To: Write Text To A File From The Terminal with “>” and “>>”

The short of it is that the pipe takes the standard output from one command and feeds it to another command as standard input. This lets you take the output from one command and parse it with another command. That’s all there is to it, which is why I’m unable to write this article.

Which is why I leaned on my good buddy AI for this article…

Mastering the Linux Terminal Pipe Command:

In the world of Linux, the terminal pipe command stands as a quintessential tool, offering a powerful and flexible means to manipulate data streams. Understanding and mastering the pipe command can significantly enhance your efficiency and productivity in the Linux environment. In this comprehensive guide, we delve into the intricacies of the pipe command, exploring its functionalities, use cases, and advanced techniques.

What is the Pipe Command?

At its core, the pipe command, represented by the symbol |, allows you to redirect the output of one command as input to another command. This seamless connection between commands enables the creation of complex data processing pipelines, facilitating the manipulation and transformation of data with remarkable ease.

Basic Usage:

The basic syntax of the pipe command is straightforward:
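As a minimal sketch, the general form is shown in the comment, followed by a tiny runnable instance. The sample lines and the variable name line_count are assumptions chosen for illustration.

```shell
# General form: command1 | command2

# printf writes three lines; wc -l reads them from the pipe and counts them
line_count=$(printf 'one\ntwo\nthree\n' | wc -l)
echo "$line_count"
```

No intermediate file is involved: printf plays the part of command1 and wc plays the part of command2.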

Here, the output generated by command1 is passed as input to command2. This chaining of commands enables the execution of multiple operations in a single line, streamlining workflows and reducing the need for intermediate files.

Practical Examples:

Let’s explore some practical examples to illustrate the utility of the pipe command:

Counting Words in a File:
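A sketch of such a command, assuming a small throwaway file.txt whose contents are made up for illustration:

```shell
# Throwaway sample file (contents are made up for illustration)
printf 'the quick brown fox\njumps over the lazy dog\n' > file.txt

# cat writes the file to standard output; wc -w counts the words it receives
word_count=$(cat file.txt | wc -w)
echo "$word_count"
```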

This command displays the number of words in the file file.txt. The cat command outputs the contents of the file, which are then piped to wc -w, which counts the words.

Searching for a Pattern:
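A sketch of this kind of pipeline, with a made-up file.txt and the word error standing in for the pattern:

```shell
# Throwaway sample file (contents are made up for illustration)
printf 'error: disk full\ninfo: all good\nerror: timeout\n' > file.txt

# grep keeps only the matching lines; wc -l counts how many came through
match_count=$(grep 'error' file.txt | wc -l)
echo "$match_count"
```

For this particular job, grep -c 'error' file.txt gives the same count without a pipe, but the piped form shows the idea.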

Here, grep is used to search for the specified pattern in the file file.txt. The output, which consists of lines containing the pattern, is then piped to wc -l, which counts the number of matching lines.

Sorting Data:
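A sketch of the idea, again using a made-up file.txt:

```shell
# Throwaway sample file with some repeated lines
printf 'pear\napple\npear\nbanana\napple\n' > file.txt

# sort orders the lines; uniq then drops the repeats that now sit side by side
sorted_unique=$(sort file.txt | uniq)
echo "$sorted_unique"
```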

This command sorts the lines in the file file.txt alphabetically and removes duplicate lines using the uniq command.

Advanced Techniques:

While the basic usage of the pipe command is invaluable, mastering advanced techniques can unlock its full potential:

Chaining Multiple Commands:

You can chain multiple commands together to create sophisticated data processing pipelines. Each command in the pipeline operates on the output of the preceding command, enabling complex transformations with minimal effort.

Combining Filters:

Filters such as grep, sed, and awk can be combined to perform intricate text-processing tasks. By leveraging the unique capabilities of each filter, you can manipulate data in a myriad of ways, ranging from pattern matching to text substitution.

Redirecting Output:

In addition to chaining commands, you can redirect the output of a pipeline to a file using the > operator. This allows you to capture the results of your data processing pipeline for future reference or analysis.
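The three techniques above can be sketched in a single line. The file names app.log and summary.txt, and the log contents, are assumptions for illustration only.

```shell
# Throwaway log file (contents are made up for illustration)
printf 'ERROR disk\nINFO ok\nERROR net\nERROR disk\n' > app.log

# grep filters, sed strips the prefix, sort and uniq -c tally the repeats,
# sort -rn puts the most frequent first, awk reshapes each line,
# and > saves the final result to a file
grep 'ERROR' app.log | sed 's/^ERROR //' | sort | uniq -c | sort -rn | awk '{print $2 ": " $1}' > summary.txt

cat summary.txt
```

Each stage only sees what the stage before it wrote, so you can build a pipeline like this one command at a time and check the output as you go.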

Common Pitfalls:

While the pipe command offers immense flexibility, it’s essential to be mindful of potential pitfalls:

  1. Order of Operations:

    The order in which commands are chained together matters. Ensure that you sequence commands appropriately to achieve the desired outcome. A misplaced command in the pipeline can lead to unexpected results or errors.

  2. Input Compatibility:

    Ensure that the output of one command is compatible with the input requirements of the next command in the pipeline. Mismatched data formats or incompatible options can result in errors or unintended behavior.

  3. Performance Considerations:

    Complex pipelines involving multiple commands may incur performance overhead. Be mindful of resource utilization, especially when working with large datasets or computationally intensive operations.
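The first pitfall is easy to see with sort and uniq. This is a small sketch with a made-up file.txt; the variable names are only for illustration.

```shell
# Throwaway sample file: the duplicate b lines are not next to each other
printf 'b\na\nb\n' > file.txt

# uniq only collapses adjacent duplicates, so both b lines survive here
wrong_order=$(uniq file.txt | sort | wc -l)

# Sorting first puts the duplicates side by side, so uniq can remove one
right_order=$(sort file.txt | uniq | wc -l)

echo "uniq then sort: $wrong_order lines"
echo "sort then uniq: $right_order lines"
```

The same two commands give different results depending on which one comes first in the pipeline.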

Addendum:

The Linux terminal pipe command is a versatile tool that empowers users to streamline data processing tasks effectively. By mastering its usage and exploring advanced techniques, you can unlock the full potential of the Linux command-line environment. Whether you’re a novice user or a seasoned Linux enthusiast, the pipe command remains an indispensable asset in your toolkit, enabling you to easily conquer complex data manipulation challenges.

Closure:

So, that was AI doing my job for me. As with grep, I tried to write an article about the pipe and it was a hot mess. I can usually salvage articles and publish something of my work, but I just didn’t do well with a couple of subjects. This is one of them.

The thing is, I refer to the pipe command with some regularity. I don’t have an article about pipes, so I can’t link to that. This leaves the reader with a search engine and I’d rather they have an excuse to open an additional link. It’s not just good SEO, it’s good hospitality. I’ll never explain everything, but I can explain some things and people won’t need to leave the site to learn those things.

Also, even AI had issues with this article. I told it to write 1200 words and it came up with maybe 600 words. I applaud those who can turn the pipe command into more than a blurb with a few examples that help people grasp the concept. Seriously, hats off to them. I don’t write nearly as well as my volume of articles would imply.

I don’t think I’ll need to use AI for any near-future articles. I’m doing two of them fairly close together because they’re things I feel need to be done. They are articles that need to be written. It is information that needs to be on the site. I did separate the two AI-written articles by some time, just to give folks a break between them. I know, they’re not preferred and they surely don’t match my writing style.

Thanks for indulging me, if nothing else. Amusingly, this isn’t much of a time-saver. The way ChatGPT formats stuff is not compatible with the editor used by my instance of WordPress. I spend a lot of time just formatting things.

Speaking of time invested…

Thanks for reading! If you want to help, or if the site has helped you, you can donate, register to help, write an article, or buy inexpensive hosting to start your site. If you scroll down, you can sign up for the newsletter, vote for the article, and comment.

Avoid Storing Commonly Used Commands In Your Bash History

This article won’t need to be all that long but it might be complicated as we discuss how to avoid storing commonly used commands in your Bash history. Yes, it’s a long title. 

This is also a bit contrary. It is one of those things that is easier done than said. It’s a very wordy thing, after all. I’ll do my best to describe what’s going on and why you might want to do this.

In this case, Bash stands for Bourne Again Shell. This article only applies to those who are using Bash. Bash is not the only shell available and people may opt to use other shells. If you’re one of those people, I don’t think this is going to work for you.

When you’re using the terminal on most distros, you’re using Bash. The commands you enter into the terminal are stored in ~/.bash_history, a hidden file in your home directory. We’ve discussed some of this before.

How To: Have Infinite Bash History
Playing With Your Bash History
How To: Not Save A Command To Bash History
How To: Reload Your .bash_profile

Well, you may type common commands, such as uptime. You may not want to store that command in your Bash history. Do you want to store every time you’ve typed the ls command?

You don’t have to. You have options!

What can you do? Well, you can tell Bash not to store certain commands in the ~/.bash_history file. This is actually a simple operation. To avoid storing commonly used commands in your Bash history, you need only to edit your ~/.bashrc file. I’ll show you how!

Man, this is going to impact the layout…

How To Avoid Storing Commonly Used Commands In Your Bash History:

Yeah, no amount of formatting is going to make that look good.

Anyhow, if this isn’t obvious, you’re going to learn how to do this in the terminal. You could edit your ~/.bashrc file with your favorite GUI editor but we’ll be doing this entire thing in the terminal.

As such, you should have an open terminal. More often than not, you can open your terminal by pressing CTRL + ALT + T. If that doesn’t work, you can find a shortcut to your terminal in your application menu. Should that not work, you’re probably already in the terminal!

So, first, we need to use Nano to edit the ~/.bashrc file. That’s an easy command: nano ~/.bashrc

Use your arrow keys to navigate to the bottom of that file. Go to the absolute bottom and press ENTER to start a new line. You can press ENTER twice to provide some separation and to make it easier to read.

Now, let’s say we don’t want to store the ls, uptime, or touch commands in your Bash history file. We’ll use those as our examples. You should also probably leave a comment in your ~/.bashrc file so that you can easily identify what the code does and remember why you added it. That’s also useful if there are other users.

So, add the following lines:
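A minimal sketch of such lines, assuming Bash’s HISTIGNORE variable is the mechanism and that a trailing * is wanted so arguments are covered too. The exact patterns are my assumption, so adjust them to taste.

```shell
# Don't store these commonly used commands in the Bash history.
# Entries are colon-separated patterns; the trailing * also covers arguments.
HISTIGNORE="ls*:uptime*:touch*"
```

Bash matches each pattern against the whole command line, so ls* also swallows things like lsblk. An entry pair such as ls:ls * is narrower if that matters to you.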

Next, save that file. As we’re using Nano, you save the file by pressing CTRL + X, then Y, and then ENTER on your keyboard.

Next, you reload your ~/.bashrc file much like you reloaded your Bash profile (which was a link in the intro, should you wish to read it). You reload the ~/.bashrc file with this command: source ~/.bashrc

That should reload the file. If it doesn’t, you can close all your terminal instances and open a new one. If that doesn’t work, you can log out and log back in again.

Anyhow…

Each entry you added sits between colons (:<command>:), and commands matching those entries will not be stored in the ~/.bash_history file. If you type a command that matches one of those entries, it will be ignored. That means your ~/.bash_history file won’t be cluttered with commands you’re already familiar with, or with commands that don’t need to be stored for things like auditing or security reasons.

It’s pretty simple to do, though it’s a bit of a pain in the butt to explain. This is how you avoid storing commonly used commands in your Bash history – something nobody is going to search for. (If you did find your way here via a search engine, be sure to leave a comment. I want to know who you are!)

Closure:

I realize that this is an awkward article and I’m okay with that. This isn’t something everyone is going to bother with, especially those people who don’t do much in the terminal. Still, it’s possible to avoid storing commonly used commands in your Bash history and now you know how.

Then, someday, someone’s going to search for this exact string of characters and, hopefully, they’ll find this article. I hope this satisfies their curiosity and helps them reach their Linux goals! If you did read this and find it valuable, you can always leave a comment.


Create A New User

Today’s article is going to be quick and easy as we simply discuss how you create a new user. This is a fairly basic task and shouldn’t take too long to cover. If you want to create a new user, read on!

If it’s not obvious, you have a user account. You use this information even if you don’t realize it. Indeed, you use this information when you log into your computer to begin with. When you log in, you’re logging into your user account.

There are other users. You may have a root account or an account for MySQL. If you want to know how many different users are on your system, you can follow along with the following article:

How To: List All Users In Linux

One of the things that helps keep Linux secure is that it’s a true multi-user environment. You can only perform operations on the files you have access to. This is why you use sudo or root.

Managing users is a fundamental task in Linux. This article is going to cover how to create a new user and we’ll be doing so in the terminal. This should be fairly universal and you won’t need to install anything as user management tools will be included by default.

We will use a couple of tools, however. The first among them is:

useradd:

The useradd command is basic and, as the name implies, is used to add new users. There’s nothing complicated about it in today’s article and you can be certain that this is already a tool available to you.

If you’re curious about the command, check the man page with man useradd.

If you do so, you’ll see that it’s described as this:

useradd – create a new user or update default new user information

So, that’s the correct tool for this job.

passwd:

The other tool we’ll be using is the passwd command. You can again tell by the name what the tool is going to do. Simply, it’s used as a password management tool. This too isn’t all that complicated and you can check the man page with this command: man passwd

If you do so, you’ll see that I wasn’t kidding and that this tool does what you think it does. It’s described like so:

passwd – change user password

This is the correct tool for the job. After we create a new user, we’ll assign them a password. If the user wishes, they can change that password on their own.

Create A New User:

As mentioned above, we’ll create a new user with terminal-based tools. This is a nice and universal way to do things. Sure, there are GUI tools out there but this is going to work on any Linux system you’re likely to engage with. You can crack open your favorite terminal, often by just pressing CTRL + ALT + T on your keyboard.

First, we’ll create a new user with the useradd command. The syntax is very simple:

For example:

Now, we’ll add a password. This is also a simple command:

For example:

You’ll be asked to enter the password a couple of times. This is to help ensure that you’ve not made any typographical errors while entering the password. It’s all basic stuff.
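The two steps above can be sketched as one small function. The function name create_user_with_password and the username testuser are hypothetical, and the commands need sudo rights, so nothing runs until you call it yourself.

```shell
# General form:
#   sudo useradd USERNAME
#   sudo passwd USERNAME

# Hypothetical helper that runs both steps for one username.
create_user_with_password() {
    new_user="$1"
    # Only ask for a password if the account was actually created
    sudo useradd "$new_user" && sudo passwd "$new_user"
}

# Usage (needs sudo rights):
#   create_user_with_password testuser
```

Depending on your distro’s defaults, useradd on its own may not create a home directory. Adding the -m flag asks for one.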

Next, you can verify that the new user account has been created. For this next step, we’ll simply use cat and grep.
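A sketch of that check, assuming the accounts live in /etc/passwd. The root account is used as a stand-in because it exists on practically every system, so swap in the username you just created.

```shell
# root is a stand-in here; replace it with your new username
user_name="root"

# cat writes the account list to the pipe; grep keeps the line starting with "name:"
entry=$(cat /etc/passwd | grep "^${user_name}:")
echo "$entry"
```

The line that comes back is colon-separated: username, password placeholder, user ID, group ID, comment, home directory, and login shell.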

Again, here’s an example:

The output should look a little something like this:

If you find your user, you’ve done this properly and you’ve learned how to create a new user. I told you that it wouldn’t be too complicated!

Closure:

So… This is an article about how to create a new user. It’s a pretty basic task but one you might just want to know about. You never know when you’ll need to create a new user but now you know where to look if you do need to. User management can be a pretty important task, especially for a server admin.


Let’s Learn About Grep

I’ve used the grep command many times but haven’t written an article about grep itself. It seems like a good idea to do so, so that I can reference this article later. Having something to link to is considered a good thing.

The thing is, I’ve tried to write this article before and it just came out terrible. That article never got published. It just wasn’t good enough. You’ve seen the quality of some of these articles, which tells you how badly that attempt went.

You won’t need to install anything for this article. If you’re using a desktop or a server, you have grep. I’d expect to find grep in embedded systems because it’s just a useful tool. I suppose some companies might have ripped it out of their devices to save space and stop you from rooting around and messing with things.

So, I decided I’d do what I’d done before. I reached out to my buddy ChatGPT and asked them to write an article. Sure enough, ChatGPT did a fine job at it. This doesn’t save me much time. I still need to do all the formatting and that takes more time than you might expect.

So then, let’s get into the article…

About Grep:

The grep command in Linux is a powerful tool used for searching text within files or standard input streams. It stands for “global regular expression print” and is primarily used to match patterns in text and display the lines that contain those patterns. grep is highly versatile and widely used in various scenarios, ranging from simple text searches to complex pattern matching and filtering tasks.

Basic Syntax:

The basic syntax for grep is as follows:
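As a minimal sketch, the general form is in the comment, followed by a runnable instance that reads from a pipe instead of a file. The sample words are made up for illustration.

```shell
# General form: grep [OPTIONS] PATTERN [FILE...]

# With no file given, grep reads standard input (here, from a pipe)
matches=$(printf 'alpha\nbeta\ngamma\n' | grep 'beta')
echo "$matches"
```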

  • pattern: The pattern to search for. It can be a simple string or a complex regular expression.
  • file: Optional. The file(s) to search. If not specified, grep reads from standard input.

Common Options:

  • -i or --ignore-case: Ignore case distinctions.
  • -v or --invert-match: Invert the sense of matching, displaying non-matching lines.
  • -r or --recursive: Recursively search subdirectories.
  • -n or --line-number: Prefix each line of output with its line number.
  • -l or --files-with-matches: Display only the names of files containing matches.
  • -E or --extended-regexp: Interpret pattern as an extended regular expression (ERE).
  • -F or --fixed-strings: Interpret pattern as a list of fixed strings (not regular expressions).

Examples:

Basic Text Search:
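A sketch of a basic search, using a throwaway file.txt with made-up contents:

```shell
# Throwaway sample file (contents are made up for illustration)
printf 'first keyword here\nnothing to see\nanother keyword\n' > file.txt

# Print every line that contains the string keyword
result=$(grep "keyword" file.txt)
echo "$result"
```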

This command searches for the occurrence of “keyword” in the file file.txt and displays all lines containing that keyword.

Case Insensitive Search:
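A sketch with -i, again against a made-up file.txt:

```shell
# Sample lines that differ only in letter case
printf 'Pattern one\nPATTERN two\nno match\n' > file.txt

# -i makes the match ignore upper and lower case
result=$(grep -i "pattern" file.txt)
echo "$result"
```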

This command performs a case-insensitive search for the pattern “pattern” in the file file.txt.

Invert Match:
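A sketch with -v, where the sample lines are assumptions for illustration:

```shell
# Sample file with one line we want to leave out
printf 'keep this\nexclude this\nkeep that\n' > file.txt

# -v prints the lines that do NOT match
result=$(grep -v "exclude" file.txt)
echo "$result"
```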

This command displays all lines in file.txt that do not contain the word “exclude”.

Search in Multiple Files:
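A sketch with two throwaway files; the names match the description and the contents are made up:

```shell
# Two small sample files
printf 'pattern in one\nother\n' > file1.txt
printf 'other\npattern in two\n' > file2.txt

# With more than one file, grep prefixes each match with the file name
result=$(grep "pattern" file1.txt file2.txt)
echo "$result"
```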

This command searches for the pattern in both file1.txt and file2.txt.

Recursive Search:
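A sketch with -r, assuming a hypothetical demo_dir created just for this purpose:

```shell
# Build a small directory tree to search (names are made up)
mkdir -p demo_dir/sub
printf 'pattern at top\n' > demo_dir/top.txt
printf 'pattern below\n' > demo_dir/sub/nested.txt

# -r descends into subdirectories; sort just makes the order predictable
result=$(grep -r "pattern" demo_dir | sort)
echo "$result"
```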

This command recursively searches for the pattern in all files within the specified directory and its subdirectories.

Display Line Numbers:
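A sketch with -n, using made-up contents so the line numbers are easy to check:

```shell
# The matches sit on lines 2 and 4
printf 'no match\npattern here\nno match\npattern again\n' > file.txt

# -n prefixes each matching line with its line number
result=$(grep -n "pattern" file.txt)
echo "$result"
```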

This command displays the line numbers along with the lines containing the pattern in file.txt.

Regular Expressions:

grep supports regular expressions, allowing for more advanced pattern matching. Regular expressions enable users to define complex search patterns, such as matching specific character sequences, ranges, or repetitions.

For example:

  • . matches any single character.
  • ^ matches the beginning of a line.
  • $ matches the end of a line.
  • [ ] specifies a character class.
  • * matches zero or more occurrences of the preceding element.
  • \ is used to escape special characters.
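The metacharacters in the list above can be tried out with a short sketch. The file names words.txt and versions.txt, and their contents, are assumptions for illustration.

```shell
# Throwaway word list
printf 'cat\ncot\ncoat\nconcat\ncart\n' > words.txt

# ^ and $ anchor the match; . stands for exactly one character
three_letters=$(grep '^c.t$' words.txt)

# [ao] is a character class; * repeats the preceding element zero or more times
class_and_star=$(grep '^c[ao]*t$' words.txt)

# \. escapes the dot so it matches a literal period
printf 'v1.0\nv1x0\n' > versions.txt
literal_dot=$(grep 'v1\.0' versions.txt)

echo "$three_letters"
echo "$class_and_star"
echo "$literal_dot"
```

Single quotes around the pattern keep the shell from touching characters like * and \ before grep sees them.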

Additional Information:

The grep command is an essential tool for text processing and manipulation in the Linux terminal. Its versatility, combined with regular expressions, allows users to perform a wide range of tasks, including simple text searches, pattern matching, filtering, and data extraction. Whether it’s analyzing log files, searching for specific information in codebases, or performing system administration tasks, grep remains an indispensable utility for Linux users. Understanding its capabilities and various options can greatly enhance productivity and efficiency when working with text data in the terminal.

Closure:

So, there you have it. You have an AI-generated article about grep. It’s formatted quite differently than I’d normally format it, but it works. I dare say that AI did the job better than I had when I tried in the past.

People worry about AI but it’s just a tool. I am slowly learning when to make it useful to me. I’m not sure how many jobs AI is going to replace, but I can see lots of ways to use AI to make life easier. I can see ways to make AI more educational, such as in this article. It’s a nice overview, I think.


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.