Skip to content

Both.org

News, Opinion, Tutorials, and Community for Linux Users and SysAdmins

Primary Menu
  • About Us
  • Computers 101
    • Hardware 101
    • Operating Systems 101
  • End of 10 Events
    • Wake Forest, NC, — 2025-09-20
  • Linux
    • Why I use Linux
    • The real reason we use Linux
  • My Linux Books
    • systemd for Linux SysAdmins
    • Using and Administering Linux – Zero to SysAdmin: 2nd Edition
    • The Linux Philosophy for SysAdmins
    • Linux for Small Business Owners
    • Errata
      • Errata for The Linux Philosophy for SysAdmins
      • Errata for Using and Administering Linux — 1st Edition
      • Errata for Using and Administering Linux — 2nd Edition
  • Open Source Resources
    • What is Open Source?
    • What is Linux?
    • What is Open Source Software?
    • The Open Source Way
  • Write for us
    • Submission and Style guide
    • Advertising statement
  • Downloads
  • Home
  • Using rsync for Backup
  • Linux
  • System Administration

Using rsync for Backup

There are many options for performing backups. Most Linux distributions are provided with one or more open source programs especially designed to perform backups. There are many commercial options available as well. But none of those directly met my needs so I decided to use basic Linux tools to do the job.
David Both April 6, 2024 8 minutes read
ian-battaglia-9drS5E_Rguc-unsplash-Cropped
2

Last Updated on July 15, 2024 by David Both

Image by: Creative Commons: CC-by-SA

Backup options

There are many options for performing backups. Most Linux distributions are provided with one or more open source programs especially designed to perform backups. There are many commercial options available as well. But none of those directly met my needs so I decided to use basic Linux tools to do the job.

In a previous article, Using tar and ssh for Backups, I showed that fancy and expensive backup programs are not really necessary to design and implement a viable backup program.

Several years ago I started experimenting with another backup option, the rsync command which has some very interesting features that I have been able to use to good advantage. My primary objectives were to create backups from which users could locate and restore files without having to untar a backup tarball, and to reduce the amount of time taken to create the backups.

This article is intended only to describe my own use of rsync in a backup scenario. It is not a look at all of the capabilities of rsync or the many other interesting ways in which it can be used.

rsync

The rsync command was written by Andrew Tridgell and Paul Mackerras and first released in 1996. The primary intention for rsync is to remotely synchronize the files on one computer with those on another. Did you notice what they did to create the name there? rsync is open source software and is provided with all of the distros with which I am familiar.

The rsync command can be used to synchronize two directories or directory trees whether they are on the same computer or on different computers but it can do so much more than that. rsync creates or updates the target directory to be identical to the source directory. The target directory is freely accessible by all the usual Linux tools because it is not stored in a tarball or zip file or any other archival file type; it is just a regular directory with regular files that can be navigated by regular users using basic Linux tools. This meets one of my primary objectives.

One of the most important features of rsync is the method it uses to synchronize preexisting files that have changed in the source directory. Rather than copying the entire file from the source, it uses checksums to compare blocks of the source and target files. If all of the blocks in the two files are the same, no data is transferred. If the data differs, only the block that has changed on the source is transferred to the target. This saves an immense amount of time and network bandwidth for remote sync. For example, when I first used my rsync BASH script to back up all of my hosts to a large external USB hard drive, it took about 3 hours. That is because all of the data had to be transferred because none of it had been previously backed up. Subsequent syncs took between 3 and 8 minutes of real time, depending upon how many files had been changed or created since the previous sync. I used the time command to determine this so it is empirical data. Last night, for example, it took 3 minutes and 12 seconds to complete a sync of approximately 750GB of data from 6 remote systems and the local workstation. Of course, only a few hundred megabytes of data were actually altered during the day and needed to be synchronized.

The following simple rsync command can be used to synchronize the contents of two directories and any of their subdirectories. That is, the contents of the target directory are synchronized with the contents of the source directory so that at the end of the sync, the target directory is identical to the source directory.

# rsync -aH sourcedir targetdir

The -a option is for archive mode which preserves permissions, ownerships and symbolic (soft) links. The -H is used to preserve hard links. Note that either the source or target directories can be on a remote host.

Now let’s assume that yesterday we used rsync to synchronized two directories. Today we want to resync them, but we have deleted some files from the source directory. The normal way in which rsync would do this is to simply copy all the new or changed files to the target location and leave the deleted files in place on the target. This may be the behavior you want, but if you would prefer that files deleted from the source also be deleted from the target, you can add the --delete option to make that happen.

Another interesting option, and my personal favorite because it increases the power and flexibility of rsync immensely, is the --link-dest option. The –-link-dest option allows a series of daily backups that take up very little additional space for each day and also take very little time to create.

Specify the previous day’s target directory with this option and a new directory for today. rsync then creates today’s new directory and a hard link for each file in yesterday’s directory is created in today’s directory. So we now have a bunch of hard links to yesterday’s files in today’s directory. No new files have been created or duplicated. Just a bunch of hard links have been created. Wikipedia has a very good description of hard links. After creating the target directory for today with this set of hard links to yesterday’s target directory, rsync performs its sync as usual, but when a change is detected in a file, the target hard link is replaced by a copy of the file from yesterday and the changes to the file are then copied from the source to the target.

So now our command looks like the following.

rsync -aH --delete --link-dest=yesterdaystargetdir sourcedir todaystargetdir

There are also times when it is desirable to exclude certain directories or files from being synchronized. For this there is the --exclude option. Use this option and the pattern for the files or directories you want to exclude. You might want to exclude browser cache files so your new command will look like this.

rsync -aH --delete --exclude Cache --link-dest=yesterdaystargetdir sourcedir todaystargetdir

Note that each file pattern you want to exclude must have a separate exclude option.

rsync can sync files with remote hosts as either the source or the target. For the next example, let’s assume that the source directory is on a remote computer with the hostname remote1 and the target directory is on the local host. Even though SSH is the default communications protocol used when transferring data to or from a remote host, I always add the ssh option. The command now looks like this.

rsync -aH -e ssh --delete --exclude Cache --link-dest=yesterdaystargetdir remote1:sourcedir todaystargetdir

This is the final form of my rsync backup command.

rsync has a very large number of options that you can use to customize the synchronization process. For the most part, the relatively simple commands that I have described here are perfect for making backups for my personal needs. Be sure to read the extensive man page for rsync to learn about more of its capabilities as well as the options discussed here.

Performing backups

I automated my backups because – “automate everything.” I wrote a BASH script that handles the details of creating a series of daily backups using rsync. This includes ensuring that the backup medium is mounted, generating the names for yesterday and today’s backup directories, creating appropriate directory structures on the backup medium if they are not already there, performing the actual backups and unmounting the medium.

I run the script daily, early every morning, as a cron job to ensure that I never forget to perform my backups.

Recovery Testing

No backup regimen would be complete without testing. You should regularly test recovery of random files or entire directory structures to ensure not only that the backups are working, but that the data in the backups can be recovered for use after a disaster. I have seen too many instances where a backup could not be restored for one reason or another and valuable data was lost because the lack of testing prevented discovery of the problem.

Just select a file or directory to test and restore it to a test location such as /tmp so that you won’t overwrite a file that may have been updated since the backup was performed. Verify that the files’ contents are as you expect them to be. Restoring files from a backup made using the rsync commands above simply a matter of finding the file you want to restore from the backup and then copying it to the location you want to restore it to.

I have had a few circumstances where I have had to restore individual files and, occasionally, a complete directory structure. Most of the time this has been self-inflicted when I accidentally deleted a file or directory. At least a few times it has been due to a crashed hard drive. So those backups do come in handy.

Read my article, How my easy, home-made backup program saves time, space on the storage medium, and network bandwidth, for a complete description of my backup program and its use of rsync,


Resources

  • Wikipedia: rsync
  • Wikipedia: Hard Links

You can download my rsbu program and its supporting files from this page. It’s distributed under the GPL V3 license.

Tags: Backups rsync

Post navigation

Previous: How to update a Linux symlink
Next: The real differences between less, more, and most

Related Stories

f44-01-day-cropped
  • Fedora
  • Linux
  • Upgrades

Fedora 44 Released

David Both April 28, 2026
command_line_prompt
  • Command Line
  • Linux
  • Programming

Writing a replacement seq command

Jim Hall April 27, 2026
dark-web
  • Linux
  • News
  • Security
  • Vulnerability

High-severity Vulnerability in PackageKit

David Both April 25, 2026

Random Quote

Technology is dominated by two types of people: those who understand what they do not manage, and those who manage what they do not understand.

— Archibald Putt

Why I’ve Never Used Windows

On February 12 I gave a presentation at the Triangle Linux Users Group (TriLUG) about why I use Linux and why I’ve never used Windows.

Here’s the link to the video: https://www.youtube.com/live/uCK_haOXPFM 

Why there’s no such thing as AI

Last October at All Things Open (ATO) I was interviewed by Jason Hibbits of We Love Open Source. It’s posted in the article “Why today’s AI isn’t intelligent (yet)“.

Technically We Write — Our Partner Site

Our partner site, Technically We Write, has published a number of articles from several contributors to Both.org. Check them out.

Technically We Write is a community of technical writers, technical editors, copyeditors, web content writers, and all other roles in technical communication.

Subscribe to Both.org

To comment on articles, you must have an account.

Send your desired user ID, first and last name, and an email address for login (this must be the same email address used to register) to subscribe@both.org with “Subscribe” as the subject line.

You’ll receive a confirmation of your subscription with your initial password as soon as we are able to process it.

Administration

  • Log in
  • Entries feed
  • Comments feed
  • WordPress.org

License and AI Statements

Both.org aims to publish everything under a Creative Commons Attribution ShareAlike license. Some items may be published under a different license. You are responsible to verify permissions before reusing content from this website.

The opinions expressed are those of the individual authors, not Both.org.

You may not use this content to train AI.

 

Advertising Statement

Both.org does not sell advertising on this website.


Advertising may keep most websites running—but at Both.org, we’re committed to keeping our corner of the web ad-free. Both.org does not sell advertising on the website. Nor do we offer sponsored articles at this time. We’ll update this page if our position on sponsorships changes.

We want to be open about how the website is funded. Both.org is supported entirely by David Both and a few other dedicated individuals.

 

 

Copyright © All rights reserved. | MoreNews by AF themes.