Setting up a ZFS backup with Syncoid, discovering why it's slow

For many years I’ve been running a NAS with ZFS to store my files. Some time ago, I decided to stop being lazy and set up a backup ZFS system, because RAID is not a backup. For now, the backup machine is sitting a few feet away from the ZFS NAS, which doesn’t really protect me against some catastrophic scenarios, but at least it’s a start. At some point I’ll figure out an off-site backup solution that I’m comfortable with (privacy, price, and ease of use are the things I care about).

Well, enough introduction, I just want to tell two short stories. The first one: how to get syncoid running in a way that’s acceptable to me. Syncoid is pretty great for syncing ZFS datasets and snapshots between machines. While looking for a tool to do that, I didn’t find anything else that was as easy to use and handled as many edge cases as syncoid does, so I decided to use it. The catch is that if you try to just sync a ZFS dataset between two machines, something like syncoid pool/dataset user@remote:pool/dataset, you’ll eventually see syncoid throwing a sudo error: “sudo: no tty present and no askpass program specified”. That’s because it’s trying to run a sudo command on the remote, and sudo has no way to ask for a password given how syncoid runs commands on the remote.

Searching online, I found many people just saying to enable SSH as root, which might be fine on a local network, but I don’t really like this. Instead, I’m more comfortable just enabling passwordless sudo for zfs commands on my user. Getting this done was very simple:

sudo visudo -f /etc/sudoers.d/zfs_receive_for_syncoid

And then fill it with the following:

<your user> ALL=NOPASSWD: /usr/sbin/zfs *

If you really want to put in the effort, you can even take a look at which zfs commands syncoid is actually invoking, and then restrict passwordless sudo to only those commands. It’s important that you do this for all of the commands that syncoid uses. Syncoid runs a few zfs commands with sudo to list snapshots and get some other information on the remote machine before doing the transfer. I had initially limited passwordless sudo to zfs receive * only, and spent quite some time figuring out why syncoid was always trying to sync from the first snapshot. In reality it just wasn’t able to list snapshots on the remote machine, so it thought that there were none!
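A quick way to catch this kind of silent failure is to try listing snapshots over SSH with sudo in non-interactive mode. The following is only a sketch: check_remote_zfs is a made-up helper name, and user@remote and pool/dataset are placeholders. It relies on sudo -n, which makes sudo fail right away instead of asking for a password.

```shell
#!/bin/sh
# Hypothetical helper (not part of syncoid): check that the remote user
# can list snapshots through sudo without a password prompt.
# "sudo -n" makes sudo exit with an error instead of prompting.
check_remote_zfs() {
    remote="$1"    # placeholder, e.g. user@remote
    dataset="$2"   # placeholder, e.g. pool/dataset
    if ssh "$remote" sudo -n zfs list -H -t snapshot -o name -r "$dataset" >/dev/null; then
        echo "ok: can list snapshots"
    else
        echo "FAIL: cannot list snapshots"
        return 1
    fi
}

# Usage:
#   check_remote_zfs user@remote pool/dataset
```

If this prints FAIL while zfs receive works, the sudoers rule is probably too narrow, which is the situation described above.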

Well, after all of this fun, I noticed that the transfer speeds were really low, nearing 11MiB/s. My machines are somewhat old, but not so old that they can’t handle gigabit ethernet, so I decided to investigate.

I ran iperf -s on one of the machines, and iperf -c <remote ip> -d on the other machine to check whether this was a networking problem or some other problem (syncoid does some compression and buffering to try to make things faster, so there could be something going on there). To my surprise, I got close to 100MiB/s in one direction (from the remote machine to the ZFS NAS), and about 20MiB/s in the other direction. Looks network-related. I ran ethtool on both ends to check if there was anything weird going on, and sure enough, the remote machine reported a speed of 100Mb/s, while the ZFS NAS reported 1000Mb/s. To quickly confirm my theory of a bad cable, I checked my router, which helpfully lights an extra LED when a link is gigabit. There was only one LED lit for the remote machine’s port, so that was that. I replaced the cable with a different one, and suddenly I had 6 to 7 times faster transfer speeds. Yay!
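As a rough sanity check, the link speed that ethtool reports can be turned into a theoretical throughput ceiling. This is a minimal sketch: the two function names are made up for illustration, eth0 is an assumed interface name, and the conversion ignores protocol overhead, so real transfers land a bit lower.

```shell
#!/bin/sh
# Read `ethtool <iface>` output on stdin and print the negotiated
# link speed in Mb/s. ethtool prints a line such as "Speed: 1000Mb/s".
link_speed_mbps() {
    awk '/Speed:/ { sub(/Mb\/s/, "", $2); print $2 }'
}

# Convert a link speed in Mb/s (10^6 bits per second) to MiB/s
# (2^20 bytes per second), rounded to one decimal place.
mbps_to_mibs() {
    awk -v mbps="$1" 'BEGIN { printf "%.1f\n", mbps * 1000000 / 8 / 1048576 }'
}

# Usage (eth0 is an assumed interface name):
#   mbps_to_mibs "$(ethtool eth0 | link_speed_mbps)"
```

A 100Mb/s link works out to about 11.9MiB/s, which is right around the 11MiB/s syncoid was managing, while a gigabit link tops out at about 119.2MiB/s.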

That’s pretty much it for this post, I just wanted to tell those two small stories. Syncoid is still syncing the entire dataset to the other machine, but from what I’ve seen, it looks like I’ll be a happy user of this tool. I’ve been thinking about investigating Nix and NixOS and eventually migrating these two ZFS machines (which are currently on Ubuntu) to NixOS, to make my life easier in the future whenever I need to set things up on another machine. Nix and NixOS kind of remind me of the Yocto project, something I worked with many years ago when developing firmware for some devices. I really enjoyed Yocto; it was likely one of the first open source projects that I thought was really well-polished. I might make a post about Nix and NixOS in the future if/when I get to explore them some more.