5 advanced rsync tips for Linux sysadmins
In a previous article entitled Sysadmin tools: Using rsync to manage backup, restore, and file synchronization, I discussed cp and sftp, and looked at the basics of rsync for moving files around. There are also a couple of other great articles here on Enable Sysadmin on tar and SSH you should take a look at. Copying files to and from remote systems and having an easy way to run a backup of something you're working on (or, for that matter, critical company data) are basic, useful tools in the sysadmin toolbox that I use again and again. Sometimes, however, you may want to do something a little more sophisticated, like move data across a less trusted or slower link. Rsync, run over SSH, can encrypt data in transit, compress it to make it flow better, and use checksums to ensure you get what you were expecting.
Maintaining a website
I first started using rsync to synchronize a local version of a website I administered back in the dark ages when CI/CD was just a twinkle in her father's eye. I could keep a local copy to work on and also have a backup of the latest version of the site. I'll use that scenario as my example. You can use rsync to synchronize any remote filesystem for backups or as a quick way to create a test-to-production pipeline. I've also used it to sync a directory and then used tar to create local backups:
skipworthy ~ enable websync ls -al
total 8
drwxrwxr-x 2 skipworthy skipworthy 4096 Dec 16 13:57 .
drwxrwxr-x 5 skipworthy skipworthy 4096 Dec 16 14:01 ..
skipworthy ~ enable websync rsync -aruv 192.168.11.111:/usr/share/httpd/enable ./
receiving incremental file list
enable/
enable/bar
enable/foo
enable/index
sent 85 bytes received 229 bytes 209.33 bytes/sec
total size is 0 speedup is 0.00
skipworthy ~ enable websync ls -l
total 4
dr-xr-xr-x 2 skipworthy skipworthy 4096 Dec 16 13:49 enable
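As before with the local rsync, we're running in archive mode (-a), which preserves mtime and file attributes and recurses into subdirectories, so the explicit -r is redundant but harmless. The -u option skips any file that is newer on the receiving side, so only data that is new or changed gets updated.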
Note: -v almost always means verbose, sending output to the console.
So let’s say I wanted to add a page to the site and upload it:
skipworthy ~ enable websync rsync -aruv ./* 192.168.11.111:/usr/share/httpd/enable
sending incremental file list
pagetwo
rsync: recv_generator: mkdir "/usr/share/httpd/enable/enable" failed: Permission denied (13)
This is a good time to note that there are some things you need to think about when using rsync to push files. Rsync needs permission to the whole directory tree, not just the destination directory. There are several ways you can accomplish this. For one, it's possible to specify the uid and gid of the rsync daemon in /etc/rsyncd.conf. Another way is to run rsync as a user with the required permissions. Both of these can be problematic in a secure environment, so tread carefully here.
Also, note that rsync has two ways of reaching a remote system. When you talk to an rsync daemon (the host::module or rsync://host/module forms), you are connecting directly to the rsync service on port 873, and you need to think about that when setting up firewall rules and permissions. The single-colon host:path form used in these examples runs over a remote shell instead, which is SSH by default in current rsync releases. In addition, there are SELinux permissions, which are a whole other discussion and not within the scope of this article. One solution to the directory permissions problem and security concerns, in general, is to use SSH (this should sound familiar by now), and passing -e ssh makes that choice explicit. SSH sets up an encrypted tunnel and can be set to listen on any port, not to mention that you can specify an SSH key to further secure the connection and make remote connections a bit more automation-friendly.
Advanced features
For this next example, I'm going to push these changes as root. Please don't be like me:
rsync -aruv -e ssh ./* root@192.168.11.111:/usr/share/httpd/enable
root@192.168.11.111's password:
sending incremental file list
pagetwo
sent 246 bytes received 36 bytes 43.38 bytes/sec
total size is 31 speedup is 0.11
Note again that the usual SSH options are available for connections, including specifying the port and key location. Rsync's -e option also lets you select a different remote shell, so long as it's installed on both ends, and per-host SSH settings can be configured in .ssh/config.
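As a rough sketch of passing SSH options through -e, the snippet below assembles a push command with a non-default port and a specific key. The user name deploy, port 2222, and the key path are assumptions for illustration, not values from my setup. The script only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical connection details; substitute your own.
remote_user="deploy"
remote_host="192.168.11.111"
ssh_port=2222
ssh_key="$HOME/.ssh/id_ed25519"

# Build the command as positional parameters so the quoted ssh string
# stays a single argument to -e.
set -- rsync -auv -e "ssh -p $ssh_port -i $ssh_key" \
    ./enable/ "$remote_user@$remote_host:/usr/share/httpd/enable/"

# Print one argument per line for review.
printf '%s\n' "$@"

# When it looks right, run it with:  "$@"
```

If you use the same host often, moving the port and key into a Host entry in .ssh/config keeps the rsync command itself short.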
Checksums
I mentioned checksums earlier, and there are two potentially useful things here. First, rsync computes a whole-file checksum for every file it transfers and verifies it on the target, which will warn you on the off chance of data getting lost or damaged in flight.
Second, you can also use checksums to determine which files to transfer (i.e., which files are actually different between source and destination), which is useful if you aren't sure if the mtime/atime represents the actual version of the file you want. I've had filesystems that I was trying to sync that were getting touched by another app, so the times were incorrect.
skipworthy ~ enable websync rsync -aruv -e ssh --checksum 192.168.11.111:/usr/share/httpd/enable ./
receiving incremental file list
sent 26 bytes received 364 bytes 780.00 bytes/sec
Note: This incurs some additional overhead in processing and in the transfer since a checksum is generated for each file on each side and then compared.
Compression
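One more useful trick is compression. The -z option will compress the stream, --zc (short for --compress-choice) sets the compression type, and --zl (short for --compress-level) sets the level. The --zc and --zl options need rsync 3.2.0 or later: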
skipworthy ~ enable websync rsync -aruv -e ssh --zc=zlib --zl=6 192.168.11.111:/usr/share/httpd/enable ./
receiving incremental file list
sent 26 bytes received 248 bytes 548.00 bytes/sec
total size is 31 speedup is 0.11
If you don't specify the type, rsync negotiates with the other side for a compression type they both support. You can customize the list of types it offers, and their order, with the RSYNC_COMPRESS_LIST environment variable.
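Before picking a type for --zc, it helps to know what your rsync build supports. The sketch below assumes rsync 3.2 or later, whose --version output includes a "Compress list:" heading followed by the algorithm names on the next line. The helper name compress_list is my own, and on an older rsync it prints nothing.

```shell
#!/bin/sh
# Hypothetical helper: read `rsync --version` text on stdin and print the
# line that follows the "Compress list:" heading.
compress_list() {
    awk '/Compress list:/ { getline; sub(/^[ \t]+/, ""); print; exit }'
}

# Show the compression algorithms this rsync build reports, if any.
rsync --version | compress_list
```

Run it on both ends of a transfer. A type that appears in both lists is a safe choice for --zc.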
Wrap up
So there you have it: rsync is yet another one of those tools that is still useful and relevant to Linux systems administration, a tool I have been glad to have many times in my career. There are many, many more things you can do with it that we did not have room to explore here. As always, check the man pages!
Glen Newell
Glen Newell has been solving problems with technology for 20 years. As a Systems Engineer and administrator, he’s built and managed servers for Web Services, Healthcare, Finance, Education, and a wide variety of enterprise applications.