Fedora 33 name resolution fails

I just installed Fedora 33 today, the first day it became available. One of the major changes, a switch from the ancient nss resolver to systemd-resolved, has already caused me a significant amount of trouble and borked my entire network. Get the whole story and the circumvention.

How I borked my computer

Even seasoned Sysadmins can have epic fails

And this was mine. It was a bit frustrating – well, a lot frustrating. I managed to totally bork my primary workstation while trying to perform some hardware upgrades along with a restructuring of my storage configuration. The story is a bit long and consists of several intersecting events that took place over a period of weeks.

I have been working with computers for over 50 years and using Linux for almost 25. I should have known better.

Installing the first SSD

It started when I began migrating my primary workstation to SSDs. You can read the long story of that here, but this is the short version.

Having noticed that my System76 Oryx Pro laptop, with its SSDs, booted much faster than my primary workstation, I decided to convert at least one of my 4 internal hard drives to SSD.

I had previously purchased an Intel 512GB M.2 NVMe SSD for a customer project that was cancelled. I ran across that SSD while looking through my few remaining hard drives. Did I mention that my laptop boots really, really fast? And my primary workstation did not.

I have also wanted to do a complete Fedora reinstallation for a few months because I have been doing release upgrades since about Fedora 21. Sometimes doing a fresh install to get rid of some of the cruft is a good idea. All things considered, it seemed like a good idea to do the reinstall of Fedora on the SSD.

I installed the SSD in one of the two M.2 slots on my ASUS TUF X299 motherboard and installed Fedora on it. I created vg01 to fill the entire device and placed all of the operating system and application program filesystems on it: /boot, /boot/efi, / (root), /var, /usr, and /tmp. I chose not to place the swap partition on the SSD because I have enough RAM that the swap partition is almost never used. Also, /home would remain on its own partition on an HDD.

The installation went very smoothly. After this I ran a Bash program I wrote to install and configure various tools and application software. That also went well – and fast – very fast.

And my workstation booted and ran much faster.

Display problems

Then, a few weeks ago, my primary display, a Dell with 2560×1600 resolution, failed. It had started blanking out – going totally dark – for a few seconds and progressed to longer and more frequent blackouts, until it blacked out and never recovered.

I purchased a new LG 32″ display with a maximum 3840×2160 resolution. The high resolution did not work with my 10-year-old graphics adapter so I had to purchase a new Sapphire Radeon 11265-05-20G to drive it. Then I had to reconfigure my desktop and apps to deal with the HiDPI display so I could read everything.

This problem did not directly affect how or why I borked my workstation, but it was one of several things happening at that time.

The second SSD

A few weeks after I performed the initial migration I decided to install another M.2 SSD in the second slot on my motherboard. I wanted to do this to speed access to my /home directory which was still located on an HDD. Also, I could then move swap to the first SSD which still had lots of room and then remove the HDD which would be empty.

I have an APC UPS which tells me how many watts of power are being consumed and I was surprised at how much difference it made to move from HDD to SSD devices. Although my measurements are a bit fuzzy, I estimate that I save about 20 (continuous) watts per device, which works out to about 480 watt-hours per day per device.
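The arithmetic behind that estimate is easy to sketch in the shell. This assumes the rough 20-watt figure above holds around the clock; daily_wh is only an illustrative helper name.

```shell
# Energy saved per day, in watt-hours, for a continuous saving given in watts.
# daily_wh is an illustrative helper name, not a standard tool.
daily_wh() {
    echo $(( $1 * 24 ))
}

daily_wh 20    # 20 W x 24 h, prints 480
```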

I moved my home directory to the new SSD which was created as vg02, turned off swap, deleted the old swap volume, and created a new 10GB swap volume on the original SSD on vg01 because there was still plenty of space there.
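For readers who want to see the swap part of that as commands, here is a minimal sketch that only prints the steps rather than running them. The old volume name passed in (oldvg/swap) is a hypothetical placeholder, and plan_swap_move is an illustrative helper; the LVM and swap commands it prints are the standard ones, and they need root when run for real.

```shell
# Print (do not run) the commands for replacing an old swap volume with a
# new 10GB logical volume named swap on vg01.
# The argument is the old swap LV in vg/lv form; oldvg/swap below is a
# hypothetical placeholder, not the name used on the original system.
plan_swap_move() (
    old_swap=$1
    cat <<EOF
swapoff -a
lvremove $old_swap
lvcreate -L 10G -n swap vg01
mkswap /dev/vg01/swap
EOF
)

plan_swap_move oldvg/swap
```

Swap is turned back on only after /etc/fstab points at the new volume.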

I had to change the entries in /etc/fstab to reflect the new locations for those two logical volumes.

/dev/mapper/vg02-home    /home    ext4    discard,defaults    1 2
/dev/mapper/vg01-swap    none    swap    discard,defaults    0 0
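A wrong field in /etc/fstab is easy to miss, so it can help to read back what the file actually says before rebooting. The following is a minimal sketch; fstab_mountpoint is an illustrative helper name, and it only looks at the first two fields of each entry.

```shell
# Print the mount point recorded in an fstab-style file for a given device.
# Usage: fstab_mountpoint FILE DEVICE
# fstab_mountpoint is an illustrative helper; comment lines are skipped.
fstab_mountpoint() {
    awk -v dev="$2" '$1 !~ /^#/ && $1 == dev { print $2 }' "$1"
}

# Usage against the real file:
#   fstab_mountpoint /etc/fstab /dev/mapper/vg02-home
```

The output should be the mount point intended for that volume. Recent versions of util-linux also provide findmnt --verify, which checks the whole file.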

I turned swap back on and all was good – until I rebooted. The startup sequence – when systemd takes over – locked up at about 2.6 seconds after starting. A bit of investigation showed that the /etc/default/grub local configuration file still contained a reference to the old swap location in the Linux kernel option line.

I changed that line to the following:

GRUB_CMDLINE_LINUX="resume=/dev/mapper/vg01-swap rd.lvm.lv=vg01/root rd.lvm.lv=vg01/swap rd.lvm.lv=vg01/usr"

I then ran the following command to recreate the grub2 configuration file.

# grub2-mkconfig > /boot/grub2/grub.cfg

I rebooted and all was well.
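This kind of stale reference can be caught before the reboot. Here is a minimal sketch that pulls the resume= device out of the GRUB defaults file so it can be compared with what actually exists; grub_resume_device is an illustrative helper, not part of GRUB, and it assumes the option sits on the GRUB_CMDLINE_LINUX line as shown above.

```shell
# Print the device named by resume= on the GRUB_CMDLINE_LINUX line.
# Usage: grub_resume_device FILE
# grub_resume_device is an illustrative helper, not part of GRUB.
grub_resume_device() {
    sed -n 's/^GRUB_CMDLINE_LINUX=.*resume=\([^ "]*\).*/\1/p' "$1"
}

# Usage: warn if the configured resume device does not exist.
#   dev=$(grub_resume_device /etc/default/grub)
#   [ -e "$dev" ] || echo "resume device $dev is missing" >&2
```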

A bit of additional testing showed significantly improved times for applications to load data from my home directory, which was the whole idea.

About testing

The reboot I did of my workstation after making the volume changes is always a part of my testing procedures. Any time I make a change that affects the runtime or startup configuration of the operating system I always perform a reboot to verify that none of my changes have caused problems with boot and startup. In this case one of them had, and I was able to fix it immediately.

You do have a standard testing procedure that you use after making changes – right?

The third SSD

By this time I had one more volume located on a hard disk that I wanted to move to an SSD to improve performance. I have over 20 virtual machines that I use for testing various Linux distributions and releases. They would still load and run fairly slowly because they were on the HDD. So I purchased a SATA 2.5″ SSD because I was out of M.2 PCIe slots on my motherboard.

The installation was as easy as any SATA device and I created a logical volume on which I could store my virtual machines. After moving the VMs to the new volume, a little testing showed significantly improved speeds.

My misteak

So after all of those changes I decided to move some other files around and restore some older ones from an old backup just so I could have them on-line again.

I needed to change the ownership of some of the restored files. I entered the command but mistyped something and I managed to run chown on most of the files in /usr, /var, /bin, and more.

A bit of fussing failed to fix it, so I reinstalled, but then a permissions error on my home directory prevented me from logging in. Re-copying the files and changing permissions did not work. So I did another reinstall and intentionally wiped my /home volume. After running my post-install script and restoring from the most recent backup I was up and running again.
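One habit that can catch this sort of typo is to list what a recursive ownership change would touch before running it. This is a minimal sketch of such a preview, not the command that was originally typed; chown_preview is an illustrative helper name.

```shell
# List what a recursive chown would change, without changing anything.
# Usage: chown_preview OWNER DIR
# Prints every path under DIR not already owned by OWNER (a user name or
# numeric ID), staying on one filesystem (-xdev).
# chown_preview is an illustrative helper name.
chown_preview() {
    find "$2" -xdev ! -user "$1" -print
}

# Usage: review the list first, then run the real command.
#   chown_preview student /home/student/restored | less
```

If paths from /usr or /var show up in the list, the directory argument is wrong. The user and directory in the usage comment are hypothetical.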

Final thoughts

I got so caught up in making all these changes that I just neglected to verify the correctness of the command I typed. It happened to me and it can happen to you.

I learned from this, as I do from all of my mistakes. That is all we can do; fix the self-inflicted problem and learn from it so we don’t do it again. At least not any time soon. ;-)

Migrating to SSD.

A few weeks ago I installed an M.2 SSD drive in my primary workstation, and just today I installed a second one. The complete story is a bit long for a post and it really belongs on my technical website, the DataBook for Linux.

The article, Converting to SSD, covers the initial conversion and then the addition of a second M.2 SSD and the problems I had when migrating a swap volume to the SSD. It’s probably not what you think.

Anyway, I hope it helps.

Installing Fedora 32 on System76 Oryx Pro

In November 2018 I ordered an Oryx Pro laptop with a 17″ display from System76. This laptop came with the System76 version of Ubuntu, POP_os!. This is an amazing laptop with 6 cores (12 CPUs) and 32GB of RAM, more than enough to run multiple VMs simultaneously, which I sometimes do when traveling or presenting.

After using POP_os! for a few months, I wanted to install Fedora on it because I have it on all of my other systems and this would make it a bit simpler to manage. I installed Fedora 29 (at the time) on a second internal SSD. I ran into problems with this because Fedora would hang at the point where the display manager was to start. The screen would be blank and the system was unresponsive. I did not spend much time with this because I had things to do and POP_os! worked fine.

And then a few weeks ago I was working on a woefully under-powered Dell Inspiron 3452 laptop and installed Fedora 32 on it. It failed the same way that my Oryx Pro did.

So here is my thought process. Fedora installed on both laptops — and failed to boot with the same symptom. POP_os! worked fine on the Oryx. The Fedora installation works from a Live USB thumb drive and does not use a display manager. POP_os! uses GNOME for a desktop and the GNOME display manager, gdm. I had been installing from a Fedora spin that uses the Xfce desktop and the lightdm display manager. Xfce does not have a display manager of its own but can use any of the others, typically lightdm, xdm, or lxdm.

I downloaded and installed the Fedora Workstation version which uses GNOME and gdm. In both cases I could now login to the laptops successfully. I then installed Xfce, my favorite desktop but continued to use gdm. Everything now works as it should and I have my favorite desktop back.
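On systemd-based distributions the enabled display manager is recorded as a symlink named display-manager.service, so it is easy to check which one will start at boot. Here is a minimal sketch; current_display_manager and its optional root-directory argument are illustrative, not standard tools.

```shell
# Print the display manager unit that systemd will start at boot.
# Usage: current_display_manager [ROOT]
# display-manager.service is a symlink to the enabled display manager's unit.
# The optional ROOT argument (an illustrative convenience) lets you inspect
# a mounted installation that will not boot.
current_display_manager() (
    link="${1:-}/etc/systemd/system/display-manager.service"
    [ -L "$link" ] && basename "$(readlink "$link")"
)

# Usage on the running system:
#   current_display_manager
```

Switching an existing installation from one display manager to another is normally a matter of systemctl disable lightdm.service followed by systemctl enable gdm.service, assuming both packages are installed.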

This is only one example of the power and flexibility of Linux. I can pick and choose the components I need. Not only can I use different combinations like this to solve problems, I can use them just because they work better for me or I like one better than the other. Choices and flexibility – Linux Rocks!

Fedora 32 rocks – with a couple issues

Fedora 32 became available yesterday and – LinuxGeek46 that I am – I managed to upgrade 7 out of 8 of my Fedora hosts yesterday as well. It is really cool and works well as I have come to expect from all of my Fedora upgrades in the past. There are some big changes underneath but this little post is not about that. Rather, this post is about the one problem I encountered.

I had no problems performing the upgrades. I have a script that I wrote to perform all the steps as shown here using dnf system upgrade so it was easy for me to do each host. The problem occurred after the reboot of my network server.

The symptom of this problem is that all the network services I run failed to start correctly. This included DHCPD, NAMED, SendMail, HTTPD, and more. The systemctl status <service> command showed that each of the services had attempted to start but had failed with errors. This was probably due to the large amount of work that the server needed to perform in the background before it was really ready to run those services.

Theoretically, systemd should start everything in parallel so long as the network is up and running. I have discovered that this is not actually true in an edge case like this.

The solution to this was to start each of the services manually. A reboot would have worked as well.
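Starting several services by hand is a natural fit for a small loop. This is a minimal sketch rather than the exact commands used here; start_services and the SYSTEMCTL override are assumptions of the sketch, while the unit names in the usage comment are the ones Fedora uses for these services.

```shell
# Start each named service in order and report any that still fail.
# SYSTEMCTL can be overridden (for example, SYSTEMCTL=echo) for a dry run;
# that variable is an assumption of this sketch, not a systemd feature.
start_services() {
    for svc in "$@"; do
        "${SYSTEMCTL:-systemctl}" start "$svc" || echo "FAILED: $svc" >&2
    done
}

# Usage (as root):
#   start_services named dhcpd sendmail httpd
```

Running systemctl --failed afterwards lists any units that are still in a failed state.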

Other than that I am very happy with Fedora 32.

Copy and Paste fails

After a recent upgrade to my primary workstation, and a new installation on a couple virtual machines, I discovered that copy and paste was not working. I use both Xfce and LXDE for my desktops on various hosts, both physical and virtual and the problem occurred with both desktops.

After a bit of thrashing around I discovered that the problem is a configuration item that has changed in more recent updates to the clipboard on these desktops. The changed configuration item specifies that the clipboard is to be automatically cleared after a default timeout of 1 second. I suspect that this configuration was added to free up memory in hosts that have a small memory installation. After all, both of these desktops are designed for lightweight hosts with fewer resources than most.

The fix

Figure 1: Remove the check mark from the “Purge history…” box.

Start by locating the clipboard icon on the desktop panel. Right-click on the clipboard icon, which displays the clipboard preferences dialog box. Then click on the History tab to view the dialog shown in Figure 1.

So the fix could be either of two things. One, I could set the timeout to be longer. Or, two, I could simply turn it off completely. I chose the latter method.

I simply removed the checkmark from the “Purge history after timeout” box. You might notice that I also changed the default to 30 seconds, just in case this box gets checked again.

I have not had any repeats of this problem since I made the change.

Migrate Thunderbird Config from one Linux Computer to Another

While getting ready to do a presentation at Open Source 101 in a few days, I decided that it would be good to reinstall POP_os! on my System76 laptop. After that I needed to install Thunderbird and migrate my Thunderbird profile to the newly installed laptop.

Due to the lack of good, accurate information about how to migrate Thunderbird profiles from one computer to another in Linux, I decided to write an article about what I discovered and share it. I have placed this document on my technical web site at Migrate Thunderbird Config from one Linux Computer to Another.

I hope it is helpful.

Security by obscurity — NOT!

As you can see in the posts below I switched internet service providers on Monday of this week. As a result I received a different block of IP addresses than I had before.

I have always heard that it only takes a few minutes for an attack to start on a computer – or any other device like phones and tablets – that is newly connected to the Internet. I determined to see how many (not if) script-kiddie attacks via SSH took place on the first full day after the changeover.

During the full day after getting new IP addresses, I experienced a total of 1634 attack attempts from 37 different IP addresses. I obtained this information from the Logwatch tool which I describe in volumes 2 and 3 of my “Using and Administering Linux: Zero to SysAdmin” series of books.
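Logwatch does the counting for me, but a rough version of the same tally can be sketched with awk. This assumes the usual OpenSSH log lines of the form "Failed password for ... from <address> port <n>"; ssh_attack_summary is an illustrative name, and Logwatch counts more kinds of events than this does, so the numbers will not necessarily match.

```shell
# Summarize failed SSH password attempts in an sshd log file.
# Usage: ssh_attack_summary LOGFILE
# Prints "<attempts> <distinct source addresses>".
# Assumes OpenSSH's usual "Failed password for ... from <addr> port <n>"
# lines; ssh_attack_summary is an illustrative name.
ssh_attack_summary() {
    awk '/Failed password for/ {
             for (i = 1; i < NF; i++)
                 if ($i == "from") { n++; ips[$(i + 1)] = 1; break }
         }
         END { c = 0; for (ip in ips) c++; print n + 0, c }' "$1"
}

# Usage on Fedora, where sshd messages normally land in /var/log/secure:
#   ssh_attack_summary /var/log/secure
```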

The crackers behind these attempts are not just searching for new computers to attack. They make the assumption that there is a computer at every IP address and attack regardless. If there is no computer at one IP address they move on to the next.

The point is that your computer or device is not safe just because it was connected to the internet five minutes ago. There are constant attacks going on and your device needs to be protected before it is connected.

Note that this is only one type of attack. There are many others that I did not even consider in this post.

Network migration complete

The migration to AT&T fiber is now complete and everything went very well. Of course that is not to say it was problem-free.

I have never been a fan of AT&T but my previous provider has been unable to resolve issues with the network just dropping out and the modem/router rebooting at frequent and inopportune times. Given the speed of fiber, the fact that it is symmetric with upload and download speeds both at 1Gb rather than uploads being so much slower as with my old provider, and the fact that it is significantly less expensive, I decided to switch.

I wanted to go with residential service which is much less expensive but I had some concerns about needing static IP addresses and with issues I have seen with blocked ports like 25 for email. I run my own web and email servers so that was important to me. After a chat session with a fairly knowledgeable rep and talking with a sales person on the phone, they both said that the static IP addresses were not a problem and that the installation tech could help set that up as well as deal with blocked ports.

They were right. Which was a surprise to me.

Scott, the installation tech, called me the morning of the installation to let me know he was on the way and he was delayed only slightly due to traffic. We discussed my needs for a few minutes and he assured me that we could do exactly what I needed. As a gamer, he was very knowledgeable and understood what I wanted and why.

After doing the physical installation of running the fiber from the street to my home office, we worked together to install the modem/router in my desired location and get it and the ONT plugged into a UPS, cabled together, and connected to the fiber. I would not let him into the narrow space available to do that so we worked together on it.

He installed updates to the Arris modem/router and we were ready to go. He showed me on his hand-held tester that the rates were both within a decimal point of 1Gb. We easily got the static IP addresses configured on the router.

I then reconfigured my own internal router. We did have some issues with blocked ports. Although I could browse the web and SSH to remote hosts, nothing was able to initiate connections to my router/firewall. After calling around to various support systems inside AT&T, Scott and I figured out how to unblock the needed ports and everything was working fine.

I did have some issues with speeds, but those problems were with my own older Linux computer that I was using for my router/firewall. I moved the hard drive from that machine to a newer one, installed the needed network adapters, made a few configuration changes and all is now well.

It just took longer than I expected but everything seems to be working very well now. Thanks for your patience and I hope you were not inconvenienced by the outages during this time.

Network migration Monday, January 20

Outages through Monday

Due to a large number of intermittent outages with my current internet provider, I have decided to move to a new provider. These outages make my web sites, with their information about my books, unavailable at random times. Please keep trying if you have problems and you should ultimately get through. The outages last several minutes at a time. This problem also delays both inbound and outbound emails.

The intermittent outages will continue through the weekend and there will be a fairly long outage of several hours on Monday as the new service is installed and I get DNS updated.

Thanks for your patience.

Speaking at Open Source 101

I will be speaking at Open Source 101 in Columbia, SC, on March 3. I will present an extended 3 hour session entitled, “Configuring and Using Bash.” This session is intended for Linux users and SysAdmins of all experience levels.

Book signing

I will also be signing copies of all four of my books. There will only be a few copies of each of my books available so there is a limit of one book per person. However, during my session I will give away one full set of all four of my books.

Abstract

The Bash shell is the default shell for almost every Linux distribution. For the Lazy SysAdmin, understanding and using the available tools to configure the Bash shell can enhance and simplify the command line experience.

In this session, which is largely based on Chapter 17 of my book, Using and Administering Linux: Volume 1 – Zero to SysAdmin: Getting Started, you will explore the several Bash configuration files for both global configuration and for users’ local configuration. You will perform simple experiments to determine the sequence in which the Bash configuration files are executed when the shell is launched.

You will explore environment variables and shell variables such as $PATH, $?, $EDITOR, and more and how they contribute to the behavior of the shell itself and the programs that run in a shell.

In this session you will learn:

  • The difference between a login shell and a non-login shell. In the interest of clearing up any confusion we will also learn about the nologin shell.
  • How the Bash shell is configured
  • How to modify the configuration of the Bash shell
  • Which Bash configuration scripts are run when it is launched as a login shell and as a non-login shell
  • The names and locations of the files used to configure Linux shells at both global and user levels
  • Which shell configuration files should not be changed
  • How to set shell options
  • How to set environment variables from the command line
  • How to set environment variables using shell configuration files
  • The function of aliases and how to set them
  • How to have some fun on the Bash command line
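As a small taste of the kind of experiment involved, here is a minimal sketch of the difference between a shell variable and an environment variable. It is not taken from the session material; GREETING and child_sees are made-up names.

```shell
# Show whether a child process can see a variable.
# A plain shell variable stays in the current shell; export puts it into the
# environment that child processes inherit.
# GREETING and child_sees are made-up names for this sketch.
child_sees() {
    sh -c 'echo "${GREETING:-unset}"'
}

unset GREETING
GREETING=hello
child_sees          # child prints: unset
export GREETING
child_sees          # child prints: hello
```

In Bash itself, shopt -q login_shell returns success when the current shell is a login shell, which is a quick way to tell the two kinds apart.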

Great first review

I am really happy to get a great 5-star first review on my Linux self-study series, Using and Administering Linux – Zero to SysAdmin. Among other things, the reviewer says, “…these 3 books are a superb new resource for newbies, experienced users, and ‘front-line’ SysAdmins.” The full review.

“Using and Administering Linux – Zero to SysAdmin” to be translated into Korean

This morning I learned from my editor that Apress has licensed all three volumes of Using and Administering Linux – Zero to SysAdmin for translation into Korean. This is a big deal and I am really excited about it.