Mar 01 2010
 

I spent some time thinking about backup strategy, and I decided that for my purposes, I’d like to handle the staging process (getting all the files put together) myself, and have the backup solution simply upload the files.  Since I want to do nightly backups, though, the backup solution needs incremental capabilities.

I narrowed it down to two possible solutions – Tarsnap and Duplicity.  Both support incremental backups, and both are command-line capable.  I decided to use Duplicity because it uploads directly to whichever back-end service you use – be it Amazon S3 or an SFTP server.  Tarsnap also stores data on S3, but that’s your only option, and because Tarsnap does some processing for you, it costs more.

Now, on to the details.

Getting Started

Standard disclaimer:  This is not at all supported by anyone and if you choose to try this, you’re doing so at your own risk.  This works fine for me, but your mileage may vary.  I am not in any way responsible for any costs this may incur to you, or any damage this may cause to you or your system(s).

Do NOT attempt to run any scripts you download from the internet without first fully understanding and testing them.  I have only tested this on my system, and I make no guarantees that it will work on your system – you may need to modify it to do so.

I welcome any feedback, of course, and if there’s enough interest, perhaps I can turn this into a project.

Requirements

The magic here is in the scripts that power this whole process.  Here are the things I wanted to (configurably) include in the backup process:

  • Dumps of all the file systems on the box
  • The output of bsdlabel (so I can put the partitions back together the same way)
  • /etc/fstab
  • root’s crontab (which I always keep in /root/crontab)
  • Custom directories outside of the main filesystems – in my case, certain locations within the /tank ZFS volume.
  • A way to easily and automatically exclude certain directories from the dumps (like /usr/src)

Background

My Filesystem Layout

My file systems are laid out as follows.  The /tank filesystem is a mirrored ZFS.  I don’t want to back up everything in /tank.

Dump(8) The File Systems

The dump(8) utility literally dumps individual file systems.  This is important – the standard FreeBSD configuration is to have the disk sliced into separate file systems – /, /tmp, /var, and /usr.  We’ll need to dump all of them individually.

For our purposes, we’re going to dump into a file.  Actually, we’re going to dump to stdout and pipe it to bzip2, and then redirect bzip2’s output to a file.  Here’s an example of the command:
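The command looks like this (the staging path is just an example; put the output wherever your script stages files):

```shell
# Full (level 0) dump of /usr, snapshotting the live filesystem,
# piped through bzip2 and redirected into a staging file
dump -0 -L -a -u -f - /dev/ad4s1f | bzip2 > /backups/staging/usr.dump.bz2
```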

If you’re an experienced UNIX/BSD/Linux user, you can probably figure out what that does, but I’ll break it down anyway:

dump:

  • -0: Dump level 0 – perform a full backup.  Dump allows you to specify different levels to do incremental backups.  The script below supports incremental backups by letting you pass the dump level on the command line.
  • -L: Tell dump that it’s dumping a live filesystem – this will cause dump to take a snapshot in time of the filesystem and then back up that snapshot.  This is important as the contents could be in a state of flux while the dump is running.
  • -a: Auto-size the dump file.  We’re not writing to a tape here.
  • -u: Update the contents of /etc/dumpdates.  This file keeps track of the last time each file system was dumped, so dump knows what to include in the incrementals.
  • -f: Write the backup to a file.  In our case, we’ve specified “-” which means write the backup data to stdout.
  • /dev/ad4s1f: The file system we’re backing up.  On my system, this is /usr.

We then pipe (|) that output into the bzip2 utility, which writes the compressed data to its own stdout.  Since we want all of that in a file, we then redirect (>) bzip2’s output to a file, which will get uploaded to S3 by the backup script and duplicity.

Install the Software

Dump is already on your system.  Duplicity is not, so you need to install it via the ports collection:
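Assuming a stock ports tree, that’s:

```shell
# Build and install duplicity from the ports collection
cd /usr/ports/sysutils/duplicity
make install clean
```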

FreeBSD should, of course, get all the dependencies for you.

You also need the bash shell installed.
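If it isn’t installed already, it’s in ports as well:

```shell
# Build and install bash from the ports collection
cd /usr/ports/shells/bash
make install clean
```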

Configuration

The script consists of three files:  backup.sh, backup_vars.sh, and .security_vars.sh.  backup.sh and backup_vars.sh are below.

Basically, you configure variables in backup_vars.sh and .security_vars.sh.  I’ve included a sample backup_vars.sh below.  The main backup script tells you how to create .security_vars.sh if it doesn’t see one.

Testing

Once you’ve set up the scripts, you should test them.  Just run:
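Assuming you keep the script in /root (adjust the path for your setup), that’s:

```shell
# Level 0 = full backup of everything configured in backup_vars.sh
bash /root/backup.sh 0
```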

This should create a full dump of all the filesystems you selected and upload them to S3 (unless you specified NOUPLOAD on the command line).

Then, run:
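That is, the same invocation with a higher dump level (again assuming the script lives in /root):

```shell
# Level 1 = only what changed since the last level 0 dump
bash /root/backup.sh 1
```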

This will create an incremental backup – just the things that have changed since the full backup (which shouldn’t be much), and it will upload those.

Here’s a small script that will show you the collection status (see the duplicity man page for more info on this):
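It boils down to a single duplicity invocation; the bucket URL below is a placeholder, so substitute the same target URL your backup script uploads to:

```shell
#!/usr/local/bin/bash
# Show which full and incremental backup sets exist at the target
duplicity collection-status s3+http://your-bucket-name/backups
```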

Scheduling

I have it in the crontab to run in the early hours of the morning, every day.  On Sunday mornings, I run a level 0 (full backup), and every other morning I run a level 1.
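The crontab entries look roughly like this (times and paths are examples):

```shell
# Full backup (level 0) early Sunday morning;
# incrementals (level 1) every other morning
0 3 * * 0    /usr/local/bin/bash /root/backup.sh 0
0 3 * * 1-6  /usr/local/bin/bash /root/backup.sh 1
```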

Restoring Your System

To restore your system, you’ll need to download the files using duplicity.  See the duplicity man page on how to do that.  Once you retrieve the files, you can re-do your partitioning and use restore(8) to restore the dumps, and then put back any custom directories.
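As a sketch, the recovery steps look something like this (the bucket URL and paths are placeholders):

```shell
# 1. Fetch the staged backup files from S3 with duplicity
duplicity restore s3+http://your-bucket-name/backups /backups/restore

# 2. Recreate the partition layout from the saved bsdlabel output,
#    then newfs and mount the filesystems (not shown)

# 3. Restore each dump into its freshly mounted filesystem
cd /mnt/usr
bzip2 -dc /backups/restore/usr.dump.bz2 | restore -rf -
```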

Code

These scripts are in their early stages.  They’re a bit messy.  Also, the syntax highlighting plugin I’m using seems to slightly mess up some indentation, but not to the point where it’s unreadable.   Finally – I’ll say it one more time – I can’t guarantee that this will work, so at the very least, it should be a starting point for you to design your own backup solution.

Here’s backup_vars.sh.  Configure everything here.
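The sample below is illustrative rather than an exact copy – the device names follow the standard FreeBSD slice layout, and the paths and bucket name are placeholders – so adjust everything to match your system:

```shell
#!/usr/local/bin/bash
# backup_vars.sh -- site-specific settings sourced by backup.sh
# (illustrative sample; every value here is a placeholder)

# Filesystems to dump, as device:label pairs
# (ad4s1f is /usr on my box; the rest follow the default layout)
FILESYSTEMS="/dev/ad4s1a:root /dev/ad4s1d:var /dev/ad4s1e:tmp /dev/ad4s1f:usr"

# Where the staged dump files are written before upload
STAGING_DIR="/backups/staging"

# Extra directories (outside the dumped filesystems) to include,
# e.g. locations within the /tank ZFS volume
CUSTOM_DIRS="/tank/home /tank/data"

# Directories to exclude from the dumps; one way to do this is to
# set the nodump flag on them with chflags(1) before dumping
EXCLUDE_DIRS="/usr/src /usr/obj"

# Duplicity target -- the bucket name is a placeholder
DEST_URL="s3+http://your-bucket-name/backups"

# Where root's crontab lives (I keep mine in /root/crontab)
CRONTAB_FILE="/root/crontab"
```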

Here’s the actual backup script:
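Yours will inevitably differ, but as a rough sketch of the flow (variable and file names are illustrative, not the real script), it does something like:

```shell
#!/usr/local/bin/bash
# backup.sh -- rough sketch of the backup flow (illustrative only)
set -e

LEVEL="$1"                        # dump level passed on the command line
cd "$(dirname "$0")"
. ./backup_vars.sh                # site settings
. ./.security_vars.sh             # AWS credentials, passphrase, etc.

mkdir -p "$STAGING_DIR"

# Save the partition layout, fstab, and root's crontab so the
# disk can be rebuilt from scratch
bsdlabel /dev/ad4s1 > "$STAGING_DIR/bsdlabel.txt"
cp /etc/fstab "$STAGING_DIR/fstab"
cp "$CRONTAB_FILE" "$STAGING_DIR/crontab"

# Keep excluded directories out of the dumps via the nodump flag
# (dump honors it at levels above 0 by default; pass -h 0 to dump
# if you want it honored on full backups too)
for dir in $EXCLUDE_DIRS; do
    chflags -R nodump "$dir"
done

# Dump each filesystem at the requested level
for fs in $FILESYSTEMS; do
    dev="${fs%%:*}"; label="${fs##*:}"
    dump -"$LEVEL" -L -a -u -f - "$dev" | bzip2 \
        > "$STAGING_DIR/${label}.dump.bz2"
done

# Stage the custom directories as compressed tarballs
for dir in $CUSTOM_DIRS; do
    tar -cjf "$STAGING_DIR/$(basename "$dir").tar.bz2" "$dir"
done

# Upload the staging directory; duplicity decides on its own
# whether this is a full or incremental upload
if [ "$2" != "NOUPLOAD" ]; then
    duplicity "$STAGING_DIR" "$DEST_URL"
fi
```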