3 steps to identifying Linux system automation candidates


How do you know what to automate first on your network? Here are three steps to put you on the right path.

Automating the tasks we perform is one of the most important parts of our jobs as sysadmins. It’s not just about performing those many tasks required to keep the systems we support up and running. It’s about making it easy on ourselves and other sysadmins who might stand in for us while we are on vacation or out sick; it’s about ensuring that we can perform our jobs quickly and easily with a minimum of work and intervention on our part; it’s about—hmmm, should I really say this—about being the lazy sysadmin.

I’ve written extensively about automation in my books and articles, and my mantra is always, “automate everything.” But how do you know where to start?

The pain point

I started down the road to automation by reducing a major pain point for one of the most important tasks that sysadmins perform—backups. I started with a very small network: one computer and an Internet connection. Backups were easy, although the technology was a series of tape drives that eventually failed.

Initially, I typed in a command on Friday evenings to back up all of my important directories and sometimes checked to verify that the backups were successfully created. They were, mainly, because of tape.

As my network grew and I became responsible for networks other than my own, I found that using the command line to make multiple backups became quite tedious. However, technology advanced and I also discovered that external USB hard drives make an excellent backup medium, and a script makes backing up several computers much easier. Using cron jobs or systemd timers also allows me to schedule backups.

My current backup system uses a Bash script that employs rsync to create backups of up to a dozen computers in my existing home network. The backups are first created on a 4TB internal hard drive and then written to one of a series of external 4TB USB hard drives. I can easily transport the external drives to my safe deposit box for off-site backup. You can read about the details of this backup system in my article, How my easy, home-made backup program saves time, space on the storage medium, and network bandwidth. The key is to find your most intense pain point and start with that.
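As a minimal sketch of the rsync idea (not the actual script described above), the functions below only print one rsync command per host so the plan can be reviewed before anything runs. The function names, the choice of /home as the directory to copy, and the destination layout are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function names, /home, and the destination layout are
# hypothetical placeholders, not taken from the author's script.

# Print the rsync command that would back up /home from one host.
backup_cmd() {
    local host="$1" dest_root="$2"
    # -a preserves permissions, times, and links; --delete mirrors removals.
    # rsync creates the per-host directory if dest_root already exists.
    printf 'rsync -a --delete %s:/home/ %s/%s/\n' "$host" "$dest_root" "$host"
}

# Print one command per host listed in a file (one host name per line).
plan_backups() {
    local hosts_file="$1" dest_root="$2" host
    while read -r host; do
        # Skip blank lines in the host list.
        if [ -n "$host" ]; then
            backup_cmd "$host" "$dest_root"
        fi
    done < "$hosts_file"
}
```

Calling `plan_backups ~/list-of-hosts /media/backups` prints the commands. Once the output looks right, the same lines can be run by hand or scheduled from a cron job or systemd timer.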

My strategy

I really have only one strategy for determining what to automate first, or next: find the task that causes me the most pain right now. That pain could be having to spend a lot of time repeatedly typing the same commands, waiting for things to happen before entering the next command, remembering the proper syntax for commands I use frequently, or whatever.

You probably already know the source of the most pain in your sysadmin life. That’s the first thing you should consider automating, especially if it’s relatively small and not as major or important as a complete, advanced backup system. I started with a straightforward backup system that used tar and some fun features of SSH, which I wrote about in Best Couple of 2015: tar and ssh.

Another pain point for me has been performing Fedora updates, which include security and functional fixes as well as feature enhancements. This also covers upgrades from one Fedora release to the next, such as from Fedora 32 to Fedora 33.
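A first small step toward automating routine updates could look like the sketch below. It relies on the documented behavior that dnf check-update exits with status 100 when updates are available, 0 when there are none, and 1 on error. The function name apply_updates is hypothetical, and release-to-release upgrades are deliberately left out.

```shell
#!/usr/bin/env bash
# Sketch only: apply_updates is a hypothetical name. Run as root.

apply_updates() {
    local rc=0
    # check-update exits 100 if updates exist, 0 if none, 1 on error.
    dnf -q check-update >/dev/null || rc=$?
    case "$rc" in
        0)   echo "No updates available" ;;
        100) dnf -y upgrade && echo "Updates installed" ;;
        *)   echo "dnf check-update failed (exit $rc)" >&2
             return 1 ;;
    esac
}
```

Keeping the logic in a function makes it easy to call from a larger script that also decides whether a reboot is needed.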

There are also many options for implementing automation regardless of the task. One part of my strategy has been to start by using scripts to fully understand the solutions and any problems that might be encountered. I’ll write a script to solve a problem on one host, copy it to all hosts on the network, and then type in command line Bash programs to perform that task on all the hosts. This takes the form:

for host in $(cat ~/list-of-hosts) ; do ssh "$host" "script-name" ; done
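When the one-liner gets retyped often, it can be wrapped in a small function. This is only a sketch: run_on_hosts and the SSH override variable are hypothetical names introduced here so the loop can be tried without touching real hosts.

```shell
#!/usr/bin/env bash
# Sketch only: run_on_hosts and the SSH variable are hypothetical names.

run_on_hosts() {
    local hosts_file="$1" remote_cmd="$2" host failed=0
    while read -r host; do
        # Skip blank lines and comment lines in the host list.
        case "$host" in ''|'#'*) continue ;; esac
        # -n keeps ssh from reading the rest of the host list on stdin.
        if ! ${SSH:-ssh} -n "$host" "$remote_cmd"; then
            echo "FAILED: $host" >&2
            failed=1
        fi
    done < "$hosts_file"
    # Non-zero if the command failed on any host.
    return "$failed"
}
```

For example, `run_on_hosts ~/list-of-hosts "script-name"` runs the script on every listed host and reports the ones where it failed.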

But even that becomes a chore and another pain point with enough hosts on enough networks. It can also be problematic when some hosts need to be treated differently from others. I found that more advanced tools such as Ansible can automate tasks over many hosts on a network while treating certain types, such as servers, differently from standard workstations. Ansible doesn’t require the distribution of scripts onto each host to perform its work; it doesn’t even need to be installed on each host—only on the system used as the “hub.”

The PHB pain point

We have all had Pointy-Haired Bosses (PHBs), and sometimes they are the pain point. Suppose some PHB asks for a list of all RPMs on a particular Linux computer and a short description of each. This happened to me while I worked at the State of North Carolina. Open source was not “approved” for use by state agencies at that time, and I only used Linux on my desktop computer. The PHBs needed a list of each piece of software installed on my system so that they could “approve” an exception.

It took me about five minutes to write a quick script that I could rerun whenever they asked the same question again. It listed the RPM packages installed on my host and extracted the description from each package. This script produced a list of over 1,900 packages with a short description of each. I sent this list to the PHB who had requested it and never heard about it again. Ever.
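A script like that can be very short. The sketch below is not the original script. It assumes the "short description" corresponds to RPM's SUMMARY tag (the DESCRIPTION tag holds the longer text), and list_rpm_summaries is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: list_rpm_summaries is a hypothetical name.

list_rpm_summaries() {
    # %{NAME} and %{SUMMARY} are standard RPM query tags.
    # LC_ALL=C gives the same sort order regardless of locale.
    rpm -qa --queryformat '%{NAME}: %{SUMMARY}\n' | LC_ALL=C sort
}
```

Running `list_rpm_summaries > installed-packages.txt` produces a file that can be sent along as-is.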

Sometimes the pain point is easily—and quickly—resolved. But the PHBs usually demand immediate attention.

Final thoughts

I started by creating a simple automation script to address the task that caused me the most pain. I then moved on to the next pain point, and so on. Eventually, those original pain points come back, and the solutions need to be refined using more advanced tools such as Ansible. This is an iterative process that will never end.