Replicating a ZFS dataset to another host.

#Backups #FreeBSD #ZFS

I recently wrote a post called High Availability, RAID10, and 3-2-1 Backups with FreeBSD. There are benefits to running the setup described there, but it can be overkill for some uses, and I don’t use it for everything. Sometimes all that’s needed is an incremental ZFS send/recv in a crontab. Admittedly, the previous post violates KISS (although we didn’t invoke a package manager in the previous post, and we won’t do so here, either).

You

Have two FreeBSD installations with ZFS enabled. You wish to replicate a ZFS dataset on an origin host to a target host, hourly. You have a non-root user on each server. You use this user to SSH into the given host.

How

First, we’ll give our non-root users permission for the specific ZFS actions we’ll take. As root, on each host…

zfs allow -u $YOURUSER send,snapshot,hold,compression,mountpoint,create,mount,receive $YOURDATASET

Next, we’ll fire off the “seed” dataset snapshot. This could be very large if you’re starting from an existing dataset, so you might not want to move the initial snapshot to the target host over the Internet. I used an external drive. Either way: take the snapshot, send the snapshot, receive the snapshot. The example below uses SSH.

zfs snap -r $YOURDATASET@seed
zfs send -Rcwv $YOURDATASET@seed | ssh $YOURUSER@$YOURHOST zfs recv -vdFu $YOURDATASET

Make sure the dataset exists on the target host first, though. Before running the above:

ssh $YOURUSER@$YOURHOST zfs create $YOURDATASET
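
For the external-drive route, the same stream can be written to a file on the drive and read back on the target. This is only a rough sketch of how that might look: the SEED_FILE path and both function names are hypothetical, and the send/recv flags are simply the ones from the SSH example.

```shell
#!/bin/sh
# Sketch only: seed the target from a file on removable media instead of SSH.
# SEED_FILE and both function names are hypothetical; adjust to your mount point.
SEED_FILE=${SEED_FILE:-/mnt/external/seed.zfs}

# On the origin host: write the full replication stream to the drive.
# $1 = dataset name
seed_to_file() {
    zfs send -Rcwv "$1@seed" > "$SEED_FILE"
}

# On the target host: read the stream back from the drive.
# $1 = dataset name
seed_from_file() {
    zfs recv -vdFu "$1" < "$SEED_FILE"
}
```

You would run seed_to_file $YOURDATASET on the origin host, carry the drive over, then run seed_from_file $YOURDATASET on the target host.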

We’ll now create the script that the cron job we set up later will run.

#!/bin/sh

# /home/$YOURUSER/bin/incremental_backup

set -e # halt script on non-zero exit

# double quotes so $YOURDATASET expands; ^ anchors the match to the start of the line
PREV=$(ssh $YOURUSER@$YOURHOST zfs list -t snap | grep "^$YOURDATASET@" | tail -n 1 | awk '{print $1}' | awk -F '@' '{print $2}')
NEXT=$(date +'%s')
FILENAME=$(printf "/tmp/snap_%s_%s.zfs" $PREV $NEXT)

zfs snap -r $YOURDATASET@$NEXT
zfs send -Rcwv -I $YOURDATASET@$PREV $YOURDATASET@$NEXT > $FILENAME

# retry rsync command up to three times.
set +e # rsync may fail; don't exit if so.
RSYNC_STATUS=1
RETRYCOUNT=0
RETRYMAX=3
while [ $RSYNC_STATUS != 0 ] && [ $RETRYCOUNT -lt $RETRYMAX ]; do
        RETRYCOUNT=$((RETRYCOUNT+1))
        rsync -P -e ssh $FILENAME $YOURUSER@$YOURHOST:/tmp/
        RSYNC_STATUS=$?
done
set -e
[ $RSYNC_STATUS -eq 0 ] || exit 1 # give up if every attempt failed

ssh $YOURUSER@$YOURHOST "zfs recv -vdFu $YOURDATASET < $FILENAME"
ssh $YOURUSER@$YOURHOST "rm $FILENAME"
rm $FILENAME

It would be a good idea to test this script before continuing.
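
One piece that can be checked without touching either host is the snapshot-name parsing. Here is a minimal sketch that feeds made-up lines, shaped like zfs list -t snap output, into a hypothetical helper called latest_snap_suffix. It assumes, as the script does, that the newest snapshot of a dataset is listed last.

```shell
#!/bin/sh
# Sketch: pull the newest snapshot suffix for one dataset out of `zfs list` output.
# latest_snap_suffix is a hypothetical helper; the sample lines below are made up.
latest_snap_suffix() {
    # $1 = dataset name; stdin = lines whose first column is dataset@snapshot
    awk -v ds="$1" '
        { split($1, parts, "@") }            # parts[1] = dataset, parts[2] = snapshot
        parts[1] == ds { last = parts[2] }   # exact dataset match only, children ignored
        END { if (last != "") print last }
    '
}

printf '%s\n' \
    'tank/data@seed        1.2G  -  10G  -' \
    'tank/data@1700000000  8K    -  10G  -' \
    'tank/data/child@seed  0B    -  1G   -' \
    'tank/data@1700003600  8K    -  10G  -' |
    latest_snap_suffix tank/data
# prints 1700003600
```

Matching the dataset name exactly, rather than with a loose grep, keeps child datasets and similarly named datasets out of the result.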

This script asks the target host for the latest snapshot it holds of the target dataset. It then creates a new snapshot on the origin host and builds an incremental stream from that latest target-side snapshot to the snapshot just created. The stream is written to a file in /tmp/ on the origin host, and that file is copied to the target host via rsync (rsync must exist on both hosts), with up to three attempts. Finally, the target host receives the stream from the file, and the file is removed from both hosts.
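
The retry logic could also be pulled out into a small function, which keeps the set +e / set -e toggling out of the main script. This is a sketch, not a drop-in replacement: retry is a hypothetical helper name, not something that ships with rsync or FreeBSD.

```shell
#!/bin/sh
# Sketch: run a command up to N times, stopping at the first success.
# `retry` is a hypothetical helper name, not part of rsync or FreeBSD.
retry() {
    max=$1; shift
    attempt=0
    while [ "$attempt" -lt "$max" ]; do
        attempt=$((attempt + 1))
        # A command tested by `if` does not trigger `set -e` when it fails.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed: $*" >&2
    done
    return 1
}
```

In the backup script this might be called as retry 3 rsync -P -e ssh $FILENAME $YOURUSER@$YOURHOST:/tmp/ and, with set -e in effect, the script stops on its own if all three attempts fail.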

Since this file is temporarily placed in /tmp/, it’s a good idea to have encryption enabled on your dataset(s). The -w flag sends a raw stream, so data from an encrypted dataset stays encrypted in the file.
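
If the stream file sitting in /tmp/ bothers you, one option is to create it with owner-only permissions and have the shell remove it however the work ends. A minimal sketch, assuming a hypothetical helper named with_private_stream_file and a snap_ filename prefix chosen only for illustration:

```shell
#!/bin/sh
# Sketch: create the stream file with owner-only permissions and remove it
# however the work exits. with_private_stream_file is a hypothetical helper.
with_private_stream_file() {
    (
        umask 077                                   # owner-only for anything created here
        f=$(mktemp "${TMPDIR:-/tmp}/snap_XXXXXX") || exit 1
        trap 'rm -f "$f"' EXIT                      # fires on success and on failure
        "$@" "$f"                                   # run the command with the file path appended
        exit $?                                     # pass the command's status back to the caller
    )
}
```

The command you pass receives the file path as its last argument, so the send and copy steps would live in a function of your own that writes the stream to "$1" and then copies "$1" to the target host.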

Let’s cron it. On the origin host under your non-root user:

$ crontab -e
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin
0       *       *       *       *       /home/$YOURUSER/bin/incremental_backup

This cron job runs at the top of every hour, so an hourly snapshot is sent to the target host.

EOF