Linux Tips • Getting you up to speed!

Update Python Packages (PIP)

We’ve had a run of Python packages recently and you can tell that I’m a fan because today we will discuss how to update Python packages that were installed via PIP. This should be a pretty easy article to follow along with.

Before diving into the world of installing Python packages from a centralized repository (via PIP), you should probably be familiar with the entire process. So, read these two articles before proceeding:

Install Python’s PIP Part One

And then follow up with this article:

Install Python’s PIP Part Two

It’s important to upgrade the packages you’ve installed with PIP. All software requires updates. Bugs are fixed with newer software but, more importantly, security issues are addressed with updates. This doesn’t just apply to Python. It applies to your whole computer. Software gets updated and you need to apply those updates.

So, today we’re going to do some maintenance and we’re going to update Python packages. Rather than waste time with a long intro, let’s get started!

Update Python Packages:

Just so you know, Python packages are installed in the terminal, so it stands to reason that updates are also done in the terminal. To follow along with this article, you will need an open terminal. You can usually just press CTRL + ALT + T to open your default terminal emulator.

With your terminal open, let’s first ensure PIP is installed with this command:

pip --version

Next, make sure PIP is updated to the newest version:

pip install --upgrade pip

With PIP upgraded to the most current version, you can check to see which Python packages you have previously installed. That’s done like this:

pip list

Now, you can see which packages can be updated to newer packages:

pip list --outdated

That will give you an output similar to this:

$  pip list --outdated
Package            Version          Latest      Type
------------------ ---------------- ----------- -----
argcomplete        1.10.3           3.3.0       wheel
async-timeout      4.0.1            4.0.3       wheel
bcrypt             3.2.0            4.1.2       wheel
beautifulsoup4     4.8.2            4.12.3      wheel
blinker            1.4              1.7.0       wheel
Brotli             1.0.9            1.1.0       wheel
build              1.0.3            1.2.1       wheel
certifi            2020.6.20        2024.2.2    wheel
chardet            3.0.4            5.2.0       wheel
click              8.0.3            8.1.7       wheel
colorama           0.4.4            0.4.6       wheel

Now, you can update the packages, like so:

pip install -U <package_name>

You can also do multiple packages at the same time:

pip install -U <package_name> <package_name>

By doing this, you can update your Python packages, at least those installed via PIP. That is indeed pretty easy.
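
If typing the names by hand gets tedious, the package names can be pulled out of the pip list --outdated table. This is only a sketch: it assumes pip prints its default column layout, it uses a few rows copied from the sample output earlier in this article, and the function names are made up for illustration.

```shell
# A few rows copied from the sample `pip list --outdated` output in this article.
sample_outdated() {
    cat <<'EOF'
Package            Version          Latest      Type
------------------ ---------------- ----------- -----
argcomplete        1.10.3           3.3.0       wheel
async-timeout      4.0.1            4.0.3       wheel
bcrypt             3.2.0            4.1.2       wheel
EOF
}

# Skip the two header lines and print the first column (the package name).
outdated_names() {
    awk 'NR > 2 { print $1 }'
}

sample_outdated | outdated_names
```

With real data, something like pip list --outdated | outdated_names | xargs -n1 pip install -U should upgrade only the packages that have a newer version.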

However, I have a command that I certainly didn’t come up with. This is a command I found in my notes and I do not see a reference URL – or I’d cite the source. Doing some searching, I saw that this command is referenced at multiple sites. So, finding the source is problematic for me.

If you want to upgrade all the Python packages at once, try this command:

pip freeze --local | grep -v '^\-e' | cut -d = -f 1 | xargs -n1 pip install -U

I tested this and it appears to work well enough. PIP does love to throw errors in the terminal but generally works okay. That command should update all the packages you’ve installed with PIP – including any Python dependencies that were installed at the same time.
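
To see what the middle of that pipeline does, here is a sketch that runs the grep and cut stages on sample text instead of real pip freeze output. The sample lines and function names are made up for illustration.

```shell
# Sample lines in the format that `pip freeze --local` prints (made up for this demo).
sample_freeze() {
    printf '%s\n' \
        'argcomplete==1.10.3' \
        '-e git+https://example.com/repo.git#egg=demo' \
        'click==8.0.3'
}

# Drop editable installs (lines starting with -e), then keep only the text
# before the first "=". '^-e' matches the same lines as '^\-e' in the command above.
package_names() {
    grep -v '^-e' | cut -d = -f 1
}

sample_freeze | package_names
```

Each name that survives the filter is then handed to pip install -U one at a time by xargs -n1.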

See? It’s pretty easy to update Python packages…

Closure:

Well, now you know how to update Python packages. I figured that this was an important article to write. If you’re going to use PIP to install Python packages, you might as well know how to keep yourself secure and how to keep yourself updated. That seemed reasonable.

However, my Python skills aren’t that great. I can do a Hello World program and that’s about it. I haven’t even done that in a while. So, don’t go asking me detailed Python questions! I probably won’t have an answer. My use is pretty limited to things I can trivially install with PIP.

Also, you may not want to ask me questions. While I’ll be polite, my time is constrained these days. I’m just as likely to refer you to a forum or two. You can ask questions. If they’re good, I’ll maybe answer them in an article. I’m just pointing out that you shouldn’t expect too much from me.

“If you don’t expect too much from me, you might not be let down.”

Thanks for reading! If you want to help, or if the site has helped you, you can donate, register to help, write an article, or buy inexpensive hosting to start your site. If you scroll down, you can sign up for the newsletter, vote for the article, and comment.

Save A Command’s Output To A File (While Showing It In The Terminal)

The title is the best I can come up with to describe this exercise as we’re going to save a command’s output to a file – while showing that same output in the terminal. This is something we’ve not quite done on this site before and something you might find interesting.

NOTE: This article assumes that you’re using Bash.

We’ve sort of covered redirect operators before. Read this article:

How To: Write Text To A File From The Terminal with “>” and “>>”

However, this time, you’re going to enter a command and see the output in the terminal, unlike what you’d see in the above-linked article. On top of that, you will simultaneously save that output to a file.

This can be handy to keep track of a command’s output over time. This can also be handy if you’re trying to audit a system and want to keep track of the output from the command.

For this article, we’re going to just use a simple example command. We’ll be making use of the uptime command because it’s easy and universal. If you’re using a desktop (or server) Linux, you have this command available.

How To: Find Your Uptime In Linux

This article is also going to make use of the tee command. If you’re a regular reader, you’ll know we’ve covered this command before. Then again, if you’re not a regular reader, you can just as easily learn about the tee command by reading the following article:

Mastering The Power Of Linux Tee Command

We’ll also be using the terminal, of course. We almost always use the terminal!

Save A Command’s Output To A File:

As suggested above, this is another article that relies on an open terminal. You can usually open a terminal by pressing CTRL + ALT + T on your keyboard. A terminal is otherwise available in your application menu.

With your terminal open, let’s just view your uptime with this command:

uptime

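Next, we’re going to tell the command to show the output AND save the output to a file. The pipe sends the output to tee, which writes it to both the terminal and the file, while 2>&1 makes sure any error messages are included as well. That’s quite simple: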

uptime 2>&1 | tee <file_name>.txt

So, let’s try this example command:

uptime 2>&1 | tee uptime.txt

Now, you can verify that this worked with this command:

cat uptime.txt

Every time you run that command, it will clear out the existing text and write the most recent output to the uptime.txt file.
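
The 2>&1 in these commands redirects standard error into standard output before the pipe, so tee sees both streams. The uptime command rarely prints errors, so here is a sketch with a stand-in command (noisy is a made-up function for this demo) that shows the difference.

```shell
# Stand-in command (made up for this demo): one line to stdout, one to stderr.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

out_file="$(mktemp)"

# Without 2>&1, only standard output travels through the pipe to tee.
noisy | tee "$out_file"
without_redirect="$(cat "$out_file")"

# With 2>&1, the error line goes through the pipe too and lands in the file.
noisy 2>&1 | tee "$out_file"
with_redirect="$(cat "$out_file")"

rm -f "$out_file"
```

In the first run the error line still shows in the terminal, but it never reaches the file.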

If you’d rather append the data, that’s easily done. It looks like this:

uptime 2>&1 | tee -a <file_name>.txt

Again, as a handy example:

uptime 2>&1 | tee -a uptime.txt

If you run that command multiple times and then check it with cat uptime.txt you’ll see that the -a flag will append the output. So, each time you run this command it will add the new output to the file.
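
Here is a small sketch of the difference between plain tee and tee -a. It uses echo and a temporary file rather than uptime so that nothing in your current directory is touched; the variable names are made up for this demo.

```shell
# Throwaway file so nothing in the current directory is touched.
log_file="$(mktemp)"

# Plain tee: each run replaces whatever the file held before.
echo "first run"  | tee "$log_file"
echo "second run" | tee "$log_file"
lines_after_overwrite="$(wc -l < "$log_file")"

# tee -a: each run adds to the end of the file instead.
echo "third run" | tee -a "$log_file"
lines_after_append="$(wc -l < "$log_file")"

rm -f "$log_file"
```

After the two plain runs the file holds a single line, and the -a run brings it to two.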

That’s all there is to it. This is a handy thing if you want to monitor the output of a command over a period of time. You can alias the uptime command to this command and have a record of all the times you ran the uptime command in the terminal.
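
An alias is one way to do that. A small shell function is another, and it works for any command rather than just uptime. This is only a sketch: the function name logged and the LOG_FILE variable are made up for this example, and the default log location is an assumption.

```shell
# Hypothetical helper: run any command, show its output, and append it to a log.
# LOG_FILE is an assumed variable name; it defaults to ~/command.log here.
logged() {
    "$@" 2>&1 | tee -a "${LOG_FILE:-$HOME/command.log}"
}
```

You would then run logged uptime instead of uptime, and every run adds its output to the log.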

I’m sure there’s more that you can do with it, but that’s a basic idea that you can take with you. It’s a pretty handy command and one that I recently shared via PM with a Linux.org user. Seeing as it was on my mind, I figured I’d make it an article. I’d call it a ‘short’ article but the title was already too darned long!

Closure:

Now you know how to save a command’s output to a file while also showing the command’s output in the terminal itself. This is a handy enough command and easy enough to do. It seemed like it’d make an easy article for folks, so I wrote it. I don’t have any other justification, though it was not all that taxing to write.

In my defense, I deserve an easy article now and then!

Demystifying journalctl: A Comprehensive Guide to Linux System Logging

It was suggested that I write an article about journalctl, which seemed like a large topic. I decided that I’d let AI have a shot at it, so this article was written by ChatGPT.

It took a few prompts to get what I wanted – which turned out to be the first result. I gave the AI the chance to rework the article but the result was that I much preferred the initial offering. After all, I was only after a very light overview of the journalctl command.

There’s a lot to the journalctl command. The journalctl command is far too much to cover in a single article. Heck, I don’t even know some aspects of the command. You can see this by checking the man page with the following command:

man journalctl

See? There’s a lot to the command. At the end of the day, AI did a good job of summing up what you really need from the command as an average user. So, I’m going to go ahead and publish that content. It did a better job than I’d have done!

Introduction To journalctl:

In the realm of Linux system administration, understanding and managing system logs is indispensable. Logs provide crucial insights into the health, performance, and security of a system. Among the plethora of tools available for log management, journalctl stands out as a powerful and versatile command for accessing and analyzing logs in systems utilizing systemd. In this comprehensive guide, we will delve into the intricacies of journalctl, exploring its features, functionalities, and practical applications.

Understanding Systemd Journal:

Systemd, the init system adopted by many modern Linux distributions, introduced the systemd journal as a replacement for traditional syslog. The journal, stored in binary format, offers numerous advantages over syslog, including structured logging, faster search capabilities, and enhanced metadata.

journalctl serves as the primary interface for querying and interacting with the systemd journal. It provides administrators with a rich set of options for filtering, displaying, and analyzing log entries, empowering them to effectively troubleshoot issues, monitor system activity, and extract valuable insights.

Basic Usage:

At its core, journalctl allows users to retrieve and view log entries from the systemd journal. The simplest invocation of journalctl displays the entire journal, starting with the oldest entries:

journalctl

This command presents a paginated output of log entries, including timestamps, the host name, the originating process, and message contents. By default, journalctl shows every entry it has access to, across all stored boots. The -b option limits the output to the current boot, and there are also options for querying logs from previous boots or specific time ranges.
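
As a sketch of the boot-related options, here are three small wrappers. The function names are made up for illustration, while the journalctl options themselves are standard. Previous boots are only available if your system keeps the journal across reboots.

```shell
# Hypothetical helper names; the journalctl options themselves are standard.
current_boot_log()  { journalctl -b --no-pager; }      # only the current boot
previous_boot_log() { journalctl -b -1 --no-pager; }   # the boot before this one
stored_boots()      { journalctl --list-boots; }       # which boots the journal holds
```

The --no-pager option prints everything straight to the terminal, which is handy when piping the output elsewhere.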

Filtering and Querying:

One of the key strengths of journalctl lies in its ability to filter log entries based on various criteria. Administrators can narrow down the search results by specifying filters such as time range, log level, systemd unit, or specific fields within log messages.

For example, to display all log entries generated by the sshd service, the following command can be used:

journalctl _SYSTEMD_UNIT=sshd.service

Similarly, to retrieve logs pertaining to a particular time range, the --since and --until options can be utilized:

journalctl --since "2024-04-01 00:00:00" --until "2024-04-15 23:59:59"

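Additionally, journalctl lets you combine several field matches in one command. Matches on different fields are combined with a logical AND, matches on the same field are combined with a logical OR, and a + between groups of matches acts as an explicit OR.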
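
Here is a sketch of how a few filters can be combined. The function names are made up for illustration, and the unit name you pass in depends on your distro (the SSH server is ssh.service on Debian and Ubuntu, for example).

```shell
# Hypothetical helper: messages of priority "err" or worse from one unit,
# current boot only. The unit name is whatever your distro calls the service.
unit_errors_this_boot() {
    journalctl -b -u "$1" -p err --no-pager
}

# Hypothetical helper: entries from either of two units since a given time.
# Matches on the same field (_SYSTEMD_UNIT here) are ORed together.
either_unit_since() {
    journalctl --since "$3" --no-pager _SYSTEMD_UNIT="$1" _SYSTEMD_UNIT="$2"
}
```

For example, unit_errors_this_boot sshd.service would show only the error-level messages from that service since the last boot.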

Output Formatting and Presentation:

journalctl offers flexible options for customizing the format and presentation of log entries. Administrators can choose from various output formats, including short, verbose, and JSON formats, depending on their preferences and requirements.

For instance, to display log entries in a more detailed and verbose format, the -o verbose option can be used:

journalctl -o verbose

Moreover, journalctl provides options for controlling the display of timestamps. Times are shown in the local time zone by default, the --utc option switches them to UTC, and output formats such as short-iso print them in ISO 8601 form.

Real-time Monitoring and Follow Mode:

In addition to viewing historical log entries, journalctl can also be used for real-time monitoring of system logs. By invoking journalctl with the -f or --follow option, administrators can tail the journal and receive live updates as new log entries are added:

journalctl -f

This feature is particularly useful for monitoring system activity in real time, diagnosing issues as they occur, and tracking the progress of system services during startup and shutdown sequences.

Integration with Other Tools:

journalctl seamlessly integrates with other Linux system administration tools, enabling administrators to combine its capabilities with those of other utilities for more comprehensive log analysis and management.

For example, grep can be used in conjunction with journalctl to perform pattern matching and further refine log queries:

journalctl | grep "error"

Furthermore, administrators can leverage shell scripting and automation to automate log analysis tasks, extract meaningful insights, and trigger alerts based on specific log patterns or conditions.
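
As a rough sketch of that kind of automation, the following counts how often a word shows up in the current boot’s log and prints a warning past a threshold. The function names, the threshold, and the wording of the warning are all made up for illustration.

```shell
# Hypothetical helper: count lines from the current boot that contain a word,
# ignoring case. Prints 0 when nothing matches.
count_in_journal() {
    journalctl -b --no-pager | grep -ci -- "$1"
}

# Hypothetical helper: print a warning when the count exceeds a limit.
warn_if_over() {
    word="$1"; limit="$2"
    count="$(count_in_journal "$word")"
    if [ "$count" -gt "$limit" ]; then
        echo "warning: \"$word\" appears $count times this boot"
    fi
}
```

Something like warn_if_over error 100 could then be run from a script or a scheduled job.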

Conclusion:

In conclusion, journalctl emerges as a powerful and indispensable tool for managing system logs in Linux environments utilizing systemd. Its rich feature set, flexible filtering options, and real-time monitoring capabilities make it an invaluable asset for system administrators, enabling them to effectively diagnose issues, monitor system activity, and ensure the stability and security of their systems.

By mastering the intricacies of journalctl, administrators can gain deeper insights into system behavior, streamline troubleshooting workflows, and maintain the optimal performance of their Linux infrastructure. Whether it’s debugging a system issue, auditing security events, or analyzing performance metrics, journalctl empowers administrators to navigate the complexities of system logging with confidence and efficiency.

Closure:

Well, I keep finding uses for AI. This is a better article than I’d have written. I think I’ll next use AI for a solid article about grep. That sounds like a reasonable subject and it’s an article that I can reference in many other articles. In fact, I should have done an article about grep already!

So, this is an article about journalctl. It’s an overwhelming command. It’s amazingly complicated and powerful, but you (as a regular user, as most of my readers are) will only need to know the basics. This is indeed the basics and they appear to be well-described.

Extract Text From Multiple File Types

Today we will have a fairly simple exercise as we’re going to just use a Python application to extract text from multiple file types. This is a pretty standard operation but will require some preparation.

Fortunately, I’m ahead of the game! You’re good to go if you follow along on the site and have already enabled PIP. Otherwise…

You will need to install PIP for this article. This is not complicated.

First, read this article:

Install Python’s PIP Part One

Technically, you could just do that. However, you should add the path so that you don’t have to specify the location of your Python applications and can easily use them from the terminal.

So, read this article:

Install Python’s PIP Part Two

Now that you’ve done those two things, you’re good to proceed. See? It was worth the time to write those articles! They’re useful and save a lot of time.

The tool we’re going to use is known as “Textract”. Don’t quote me on this, but I believe this could also apply to Windows users, though installing the dependencies for this would be a different process. I’m not a Windows user. If you are, feel free to comment and let us know how things work on your side of life.

Textract:

While there is no built-in man page, the Textract application is described like this:

While several packages exist for extracting content from each of these formats on their own, this package provides a single interface for extracting content from any type of file, without any irrelevant markup.

It is a pretty handy application and claims to extract the text from more file types than I could reasonably expect to test. Here’s a list of files that you should be able to extract text from.

.csv via python builtins
.doc via antiword
.docx via python-docx2txt
.eml via python builtins
.epub via ebooklib
.gif via tesseract-ocr
.jpg and .jpeg via tesseract-ocr
.json via python builtins
.html and .htm via beautifulsoup4
.mp3 via sox, SpeechRecognition, and pocketsphinx
.msg via msg-extractor
.odt via python builtins
.ogg via sox, SpeechRecognition, and pocketsphinx
.pdf via pdftotext (default) or pdfminer.six
.png via tesseract-ocr
.pptx via python-pptx
.ps via ps2text
.rtf via unrtf
.tiff and .tif via tesseract-ocr
.txt via python builtins
.wav via SpeechRecognition and pocketsphinx
.xlsx via xlrd
.xls via xlrd

You may need to install specific packages for some of these file formats. Those packages can usually be found in your default repositories. It otherwise comes with quite a lot of functionality out of the box.

I did test some of those formats and it seemed to work okay. Your mileage may vary, of course. However, Textract was able to extract text from multiple file types.

Extract Text From Multiple File Types:

If you want to extract text from multiple file types with Textract (a fantastic name for an application) then you’ll first need to install it. I’ve yet to find a working GUI PIP installation tool, so that means you’re going to need an open terminal.

More often than not, you can open your terminal by simply pressing CTRL + ALT + T on your keyboard. If your distro doesn’t adhere to the norms, you can find a terminal in your application menu. If you don’t use an application menu, you already know how to open a terminal and you don’t need any help from me.

First, let’s install Textract:

pip3 install textract

Note the lack of sudo. You’re installing this for your user account and do not need elevated permissions for this. Python packages go right into your home directory (typically under ~/.local). See below, as you’ll want to install some dependencies for full functionality.

You may see an error or two during installation but that doesn’t seem to matter. It will take a minute to install and watching the installation chug along is good fun.

Using Textract:

With Textract installed, you can now extract text from a whole variety of file types. The syntax is as follows:

textract /path/to/file

That sends the output to the standard output (your terminal). I suspect that most folks are going to want to save the output to a file. For that, you just need to add the -o flag and a file name. So, something like this:

textract /path/to/file -o <new_file_name>.txt

That’s going to extract the text from some file types but not all of them.

Now, this is from a Lubuntu installation…

This isn’t going to work with all the listed file types at this time. You need some dependencies to be installed. For me, and it’s a long one, the command was:

sudo apt install python2-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev

That’s slightly different from the command they include on their page, but it appears to do the trick. You’ll have some of those installed by default but running the command will sort itself out. You’ll have to modify the command to suit your distro, but that should work with Debian, Ubuntu, Linux Mint, and other Debian-based distros.

With that installed, I can even grab the text from image files.

Here’s an example:

a simple picture with simple text — This is some simple text to test how well Textract really works.

Here’s the command:

textract simple_text.png

Here’s the output:

$ textract simple_text.png
This is some simple text.
Let's see Textract in operation!

I dare say that’s pretty good. I tried other pictures and it was good enough to get the gist of things. Complicated image files with many columns appear to be a bit of a stumbling block. But it’s not terrible.

It has no trouble at all with other file formats.

It can be a bit fussy to get Textract properly installed but it seems to do the trick once installed. If you want to extract text from multiple file types, Textract is a pretty good piece of software.
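
Since the whole point is handling multiple file types, here is a sketch that runs the -o form of the command over several files in one go. The function name extract_all and the choice to add “.txt” to each file name are made up for this example.

```shell
# Hypothetical helper: run textract on every file given and save each result
# next to the original with ".txt" added to the name.
extract_all() {
    for f in "$@"; do
        # If textract can't handle a file, note it and move on to the next one.
        textract "$f" -o "$f.txt" || echo "skipped: $f" >&2
    done
}
```

You would call it like extract_all report.pdf notes.docx scan.png, and any file that fails is reported rather than stopping the loop.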

Closure:

If you want to extract text from multiple file types, this is definitely a good tool for the job. It certainly handles a lot of files and does a good job with them. It’s not perfect. None of these tools are. Complicated image files threw it off a bit, but Textract lives up to its name.

There was a reason I wrote those articles about PIP. Being able to install Python packages via a repository is a great thing. There’s some great Python software out there and we’ve barely touched the surface. Linux is great like that, in that it offers great Python support.

Do you have a use for this in your daily activities? If so, leave a comment letting us know how you use Textract and what makes you pick it over other applications. You can even use a real email address. I never send spam. I never sell your information.

Meta: I’ve Been At This For Three Years!

I have not done a meta article lately. I don’t find them interesting to write even though people seem to show some interest in them. They’re a pain to write and I can write a regular article easier than a meta article.

After all, my normal meta article is just updating you on how much traffic the site gets and how much bandwidth the site uses. I also tend to toss in some stats about where the traffic comes from and what that traffic looks like.

That’s not all that fun and can be a pain to write.

So, let’s get that out of the way…

February wasn’t as great but January was awesome. Last month saw a new level reached – where this site averaged more than 1000 visitors per day.

visit statistics — Yup… There are a lot of you these days…

I guess the next major threshold will be when I get 1000 unique visitors in a month. Quite a few of you visit more than once, and those are the ‘visits’ in the chart. The unique visitors are just that: how many unique people visit the site.

Sure enough, my traffic still comes from Google – even though I no longer use AdSense. As far as search traffic, the next two most popular search engines (for this site) are DuckDuckGo and StartPage.

The most popular operating systems are Linux, MacOS, and Windows. I suppose that makes some sense. Some of what’s on this site applies to MacOS as it’s a POSIX-compliant operating system.

The most popular browsers are Chrome (and those that identify as Chrome), Unknown, and Safari. Firefox isn’t well represented here. They’re 4th on the list with about 19% of my traffic.

My legitimate traffic comes from the United States, Germany, and the United Kingdom. China and Russia are in the stats, but they’re mostly bots. Finding accurate stats to pick between them can be difficult. So, the above is about what I can figure out.

Yeah, I regularly consume 50+ GB of traffic per month. Considering the site is pretty much pure text, that’s a lot of writing. There are millions of words on the site now and I’m still not out of ideas.

Anniversary!

This is our third anniversary! Three years!!! Whodathunkit?

This was the first article:

Welcome to Our New Home!

I’ve had some guest articles along the way (and at least another one coming when I can schedule it). Most of the articles, 99.999% at least, have been written by me.

We don’t get a lot of donations but we get a couple here and there. That helps cover the costs, plus people can now advertise here on the site. That’s helpful. I do love my CDN though it (and hosting) are expenses.

This has taken a whole lot of my time. I value my time, but I guess I also value the site. Otherwise, I’d have stopped publishing articles. One of these days I’ll quit but I plan on keeping up with the schedule for the time being.

That schedule? Well, you get a new article every other day. Most of those articles don’t contain any major errors, which is nice. I can’t be perfect all of the time, but I do my best. You’re welcome.

I’m working my way towards a million visits. That’s nice.

Here are the most popular articles:

How To: Remove AppArmor From Ubuntu
Change Between CLI and GUI Mode
How To: Disable Sleep And Hibernation on Ubuntu Server

Once upon a time, I was stoked to see 20 visits in a day. So, I guess I’ve built something here. Pardon me, but I’m a wee bit proud of my accomplishment. It has come a long way.

As such, I’m just going to keep this short.

Thank you. Thank you for your readership and encouragement. Here’s to another year. I think I’ve got another year of this in me. We’ll have to see. I’m bound to miss a day eventually, but this is three years without doing so (technically).

Closure:

That’s all, folks. I appreciate you.