I previously discussed configuring JungleDisk on FreeBSD. It’s not the easiest thing to install, since FreeBSD isn’t officially supported. To take that a step further, I’m now going to show what I do to back up my FreeBSD box at home.
Update, November 2009: I am no longer using JungleDisk to back up my FreeBSD box. JungleDisk recently released version 3.0 of their software, which does not include a command-line Linux version in the standard desktop edition. I was advised to stick with the old version if I wanted to continue backing up. Instead, I chose to change over to Duplicity. I will write a post on Duplicity in the near future.
There are a couple of steps to this process. First, we must perform the backup itself. I’m using dump(8) for this purpose. This program is built right into FreeBSD; its purpose in the original UNIX was to dump a file system to a tape drive, but we’re going to use it to dump the filesystem to a file. The second step is to have JungleDisk back the files up to S3.
Standard disclaimer: This is not at all supported by JungleDisk and if you choose to try this, you’re doing so at your own risk. This works fine for me, but your mileage may vary. I am not in any way responsible for any costs this may incur to you, or any damage this may cause.
Filesystem Layout
Let’s talk about my FreeBSD box. Its primary purpose is network-attached storage, and for that I have a ZFS filesystem mounted on /tank. Other than that, it’s pretty standard. Here’s my “df -h” output, for reference:
[root@darkhelmet ~]# df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/ad4s1a 496M 423M 33M 93% /
devfs 1.0K 1.0K 0B 100% /dev
/dev/ad4s1e 989M 2.4M 908M 0% /tmp
/dev/ad4s1f 101G 3.1G 90G 3% /usr
/dev/ad4s1d 1.9G 322M 1.5G 18% /var
Dump(8) The File Systems
The dump(8) utility dumps one file system at a time. This is important: the standard FreeBSD configuration is to have the disk split into separate file systems (/, /tmp, /var, and /usr), so we’ll need to dump each of them individually.
For our purposes, we’re going to dump into a file. Actually, we’re going to dump to stdout and pipe it to gzip, and then redirect gzip’s output to a file. Here’s the command:
dump -0Lauf - /dev/ad4s1f | gzip > /tank/backup/darkhelmet/dumps/usr.dump.gz
If you’re an experienced UNIX/BSD/Linux user, you can probably figure out what that does, but I’ll break it down in case you’re more of a novice:
dump:
- -0: Dump level 0 – perform a full backup. Dump allows you to specify different levels to do incremental backups. I’m not going to do incremental backups at this time, so we’ll always leave the dump level at 0.
- -L: Tell dump that it’s dumping a live filesystem – this will cause dump to take a snapshot in time of the filesystem and then back up that snapshot. This is important as the contents could be in a state of flux while the dump is running.
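- -a: Auto-size the dump. This bypasses dump’s tape length calculations and simply writes until the output is finished, which is what we want since we’re not writing to a tape.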
- -u: Update the contents of /etc/dumpdates. This file keeps track of the last time each file system was dumped, in case you want to start doing incremental backups.
- -f: Write the backup to a file. In our case, we’ve specified “-” which means write the backup data to stdout.
- /dev/ad4s1f: The file system we’re backing up. On my system, this is /usr.
We then pipe (|) that output into the gzip utility, which writes the compressed data to stdout. Since we want that all in a file, we then redirect (>) the gzip output to a file.
When that command is run, I end up with a file called usr.dump.gz in /tank/backup/darkhelmet/dumps, and that will be the file I will have JungleDisk back up to S3.
Once again, it’s important to note that you must dump each filesystem individually.
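Because the dump is piped through gzip, the shell only reports gzip’s exit status, so it can be worth checking each archive before it is uploaded. The following is a minimal sketch, not part of my original script: the function name check_dump is hypothetical, and it only confirms that the file is a complete, readable gzip stream (it would catch a truncated file from a full disk, but not a dump that ended early on its own).

```shell
# Hypothetical helper: succeed only if the file is an intact gzip stream.
check_dump() {
    # $1: path to a .dump.gz file
    gzip -t "$1" 2>/dev/null
}

# Example use (path taken from the text above):
#   check_dump /tank/backup/darkhelmet/dumps/usr.dump.gz || echo "usr dump is damaged" >&2
```

For a deeper check, listing the archive’s contents with something like “gzcat usr.dump.gz | restore -tf -” should exercise the dump format itself, though I haven’t made that part of the script.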
Script it Out
I wrote a simple shell script to loop through all of my filesystems and dump each of them. I probably could have automated it more by parsing the output of a mount command, but I didn’t want to get too complicated with it.
#!/bin/sh
# Space-separated list of device=name pairs to dump
FSLIST="/dev/ad4s1a=root /dev/ad4s1f=usr /dev/ad4s1d=var"
# Directory that receives the compressed dump files
DUMPDIR="/tank/backup/darkhelmet/dumps"
for FSITEM in ${FSLIST}; do
    FS=`echo ${FSITEM} | awk -F= '{ print $1 }'`
    NAME=`echo ${FSITEM} | awk -F= '{ print $2 }'`
    echo "FS: ${FS}"
    echo "NAME: ${NAME}"
    echo "dump -0Lauf - ${FS} | gzip > ${DUMPDIR}/${NAME}.dump.gz"
    dump -0Lauf - "${FS}" | gzip > "${DUMPDIR}/${NAME}.dump.gz"
done
I wrote this script to be easily configurable. The FSLIST variable contains a space-separated list of the filesystems I want to back up and their names, in a “key=value” type of list. The DUMPDIR variable tells the script where to put the dump files.
We then loop through the ${FSLIST} variable, and use awk to separate the values and get the file system into the ${FS} variable and the name into the ${NAME} variable. Finally, we use those two variables, along with the ${DUMPDIR} variable, to construct our command lines.
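The awk calls are not strictly required for the split. As an alternative sketch (not what my script does), POSIX parameter expansion can separate each “device=name” pair without starting extra processes. The function name split_item is hypothetical; the FSLIST value is the one from my script.

```shell
FSLIST="/dev/ad4s1a=root /dev/ad4s1f=usr /dev/ad4s1d=var"

# Hypothetical helper: split "device=name" into the FS and NAME variables.
split_item() {
    FS="${1%%=*}"    # everything before the first "="
    NAME="${1#*=}"   # everything after the first "="
}

for FSITEM in ${FSLIST}; do
    split_item "${FSITEM}"
    echo "${FS} -> ${NAME}"
done
```

This only prints the pairs; the dump command would go where the echo is.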
On my system, this script basically runs the following 3 commands:
dump -0Lauf - /dev/ad4s1a | gzip > /tank/backup/darkhelmet/dumps/root.dump.gz
dump -0Lauf - /dev/ad4s1f | gzip > /tank/backup/darkhelmet/dumps/usr.dump.gz
dump -0Lauf - /dev/ad4s1d | gzip > /tank/backup/darkhelmet/dumps/var.dump.gz
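I mentioned that the device list could be built from mount output instead of being hard-coded. Here is a rough sketch of that idea, assuming the input is in fstab format (device, mount point, type, and so on), which is what “mount -p” prints on FreeBSD. The function name build_fslist and the rule for turning a mount point into a file name are my own inventions for illustration.

```shell
# Hypothetical helper: read fstab-style lines on stdin and print
# "device=name" for every UFS filesystem found.
build_fslist() {
    awk '$3 == "ufs" {
        name = $2
        if (name == "/") {
            name = "root"
        } else {
            sub(/^\//, "", name)    # drop the leading slash
            gsub(/\//, "_", name)   # flatten nested mount points
        }
        printf "%s=%s\n", $1, name
    }'
}

# Example use on the FreeBSD box:
#   FSLIST=`mount -p | build_fslist`
```

Note that this would also pick up /tmp, which I deliberately leave out of my backups, so the list would need filtering before use.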
Here’s the full output of this script:
[root@darkhelmet /tank/backup/darkhelmet]# ./dh_backup.sh
FS: /dev/ad4s1a
NAME: root
dump -0Lauf - /dev/ad4s1a | gzip > /tank/backup/darkhelmet/dumps/root.dump.gz
DUMP: Date of this level 0 dump: Sun Mar 1 16:54:35 2009
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping snapshot of /dev/ad4s1a (/) to standard output
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 427200 tape blocks.
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: DUMP: 427199 tape blocks
DUMP: finished in 67 seconds, throughput 6376 KBytes/sec
DUMP: level 0 dump on Sun Mar 1 16:54:35 2009
DUMP: DUMP IS DONE
FS: /dev/ad4s1f
NAME: usr
dump -0Lauf - /dev/ad4s1f | gzip > /tank/backup/darkhelmet/dumps/usr.dump.gz
DUMP: Date of this level 0 dump: Sun Mar 1 16:57:06 2009
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping snapshot of /dev/ad4s1f (/usr) to standard output
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 3147647 tape blocks.
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: 34.90% done, finished in 0:09 at Sun Mar 1 17:11:27 2009
DUMP: 77.39% done, finished in 0:02 at Sun Mar 1 17:10:03 2009
DUMP: DUMP: 3148664 tape blocks
DUMP: finished in 797 seconds, throughput 3950 KBytes/sec
DUMP: level 0 dump on Sun Mar 1 16:57:06 2009
DUMP: DUMP IS DONE
FS: /dev/ad4s1d
NAME: var
dump -0Lauf - /dev/ad4s1d | gzip > /tank/backup/darkhelmet/dumps/var.dump.gz
DUMP: Date of this level 0 dump: Sun Mar 1 17:11:16 2009
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping snapshot of /dev/ad4s1d (/var) to standard output
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 338748 tape blocks.
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: DUMP: 338680 tape blocks
DUMP: finished in 174 seconds, throughput 1946 KBytes/sec
DUMP: level 0 dump on Sun Mar 1 17:11:16 2009
DUMP: DUMP IS DONE
[root@darkhelmet /tank/backup/darkhelmet]#
As you can see, we get some pretty detailed output from our dump commands. Here are the files I have in my dump directory:
[root@darkhelmet /tank/backup/darkhelmet/dumps]# ls -alh
total 1532535
drwxr-xr-x 2 root dave 5B Mar 1 16:03 .
drwxr-xr-x 4 dave dave 5B Mar 1 16:54 ..
-rw-r--r-- 1 root dave 127M Mar 1 16:55 root.dump.gz
-rw-r--r-- 1 root dave 1.1G Mar 1 17:10 usr.dump.gz
-rw-r--r-- 1 root dave 230M Mar 1 17:14 var.dump.gz
[root@darkhelmet /tank/backup/darkhelmet/dumps]#
I now have my entire FreeBSD system (except /tmp and /dev) completely backed up into three nice (relatively small) files.
Upload to S3
The next step is to upload them using JungleDisk. Remember, when running under FreeBSD’s Linux binary interface, JungleDisk will see directories relative to /usr/compat/linux. So in my jungledisk-settings.xml, I have configured my backup directory to be /backups. In the real world, that’s /usr/compat/linux/backups. However, for some reason, it uses the real filesystem layout when you specify the configuration file on the command line. I haven’t figured this one out yet, but hey, it works!
I’ve added to my shell script to copy the dump files over to the /usr/compat/linux/backups directory. You might instead consider having the dumps go directly into that directory.
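The copy step can be as simple as the sketch below. The function name stage_dumps is hypothetical and this is not a verbatim excerpt from my script; it just copies every *.dump.gz file from the dump directory into the directory JungleDisk is configured to back up.

```shell
# Hypothetical helper: copy the compressed dumps into the upload directory.
stage_dumps() {
    # $1: directory holding the *.dump.gz files
    # $2: directory JungleDisk backs up
    mkdir -p "$2" || return 1
    for f in "$1"/*.dump.gz; do
        [ -e "$f" ] || continue      # nothing matched the pattern
        cp -p "$f" "$2"/ || return 1
    done
}

# Example use with the paths from this post:
#   stage_dumps /tank/backup/darkhelmet/dumps /usr/compat/linux/backups
```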
Set up your copy of JungleDisk using the tutorial in my JungleDisk on FreeBSD post, and copy the jungledisk-settings.xml to the BSD box.
Add the JungleDisk command to your shell script:
cd /path/to/jungledisk/binary
./jungledisk -o config=/tank/backup/jungledisk/jungledisk-settings.xml --startbackups -f --exit -d
Of course, feel free to tweak the command-line options. Here’s what this command line does:
- -o config=/tank/backup/jungledisk/jungledisk-settings.xml: Points JungleDisk to its XML configuration file.
- --startbackups: Start all backups in the configuration file immediately.
- -f: Stay in the foreground. I haven’t played with daemonizing JungleDisk, and I’m going to run this shell script from the crontab, so there’s no need to fork into the background.
- --exit: Exit when idle. This will cause JungleDisk to exit a few moments after the backup completes.
- -d: Enable debugging. Most of this information is useless as far as I’m concerned, but I don’t mind seeing it.
Notes
You should probably include in your script a command to output the bsdlabel of your disks to a text file that will be uploaded along with your dump files. To do this:
bsdlabel [disk] > /path/to/bsdlabel.txt
On my system, it is:
bsdlabel ad4s1 > /tank/backup/darkhelmet/dumps/bsdlabel.txt
Additionally, it’s a good idea to make a copy of your /etc/fstab in that directory as well. You’ll need the bsdlabel output to make sure you’re restoring the dumps to the proper partitions, and the backup of the fstab just for added peace of mind.
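Both pieces of layout information can be saved in one step. The sketch below is an illustration rather than part of my script: the function name save_layout and the optional third argument are hypothetical, and it assumes bsdlabel is on the PATH, as it is on FreeBSD.

```shell
# Hypothetical helper: save the disk label and fstab next to the dumps.
save_layout() {
    # $1: disk slice (for example ad4s1)
    # $2: destination directory
    # $3: fstab to copy (defaults to /etc/fstab)
    bsdlabel "$1" > "$2/bsdlabel.txt" || return 1
    cp "${3:-/etc/fstab}" "$2/fstab" || return 1
}

# Example use with the paths from this post:
#   save_layout ad4s1 /tank/backup/darkhelmet/dumps
```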
Schedule It
The last step is to put this script into a crontab. To keep costs down, I’m not running mine very often – I currently have it set up to run once a week, but I may even change that to once every other week. My box doesn’t change that much, so there’s no need to constantly have up-to-date backups.
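A weekly entry in root’s crontab might look like the sketch below. The schedule (Sundays at 03:00) and the log file path are assumptions for illustration; the script path is where dh_backup.sh lives in the output shown earlier.

```
# minute hour day-of-month month day-of-week command
0 3 * * 0 /tank/backup/darkhelmet/dh_backup.sh >> /var/log/dh_backup.log 2>&1
```

Edit the table with “crontab -e” as root, since dump needs to read the raw disk devices.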
Restoring Your System
To restore your system, you’ll need to boot the system in single-user mode (from the hard disk if possible, or from the install or rescue CD, or a custom boot disk, whatever you want). You’ll also need to download a copy of your dump files. My plan is to use another machine to download the dump files, and I will place them on a USB storage device.
For each file system (I’ll show an example using my / filesystem) do the following:
Format the filesystem:
newfs -U /dev/ad4s1a
Mount the new filesystem:
mkdir /mnt/newfs
mount /dev/ad4s1a /mnt/newfs
Mount the USB drive:
mkdir /mnt/usb
mount -t msdosfs /dev/da0s1 /mnt/usb
cd to the partition being restored:
cd /mnt/newfs
Restore our backup:
gzcat /mnt/usb/root.dump.gz | restore -rf -
Unmount the filesystems:
cd /
umount /mnt/newfs
umount /mnt/usb
Finally, reboot:
shutdown -r now
Note: I haven’t yet tried this procedure. Credit for these steps goes to a post on the FreeBSD forum.
Next Steps
Once you’ve got this procedure set up, you may want to start tweaking. I plan on playing with dump(8)’s ability to do incremental backups, as this should save on my S3 bandwidth costs. As it stands right now, I’ll be uploading a fresh copy of the backup every time, currently about 1.5GB. With S3, that ends up being about $0.15 each time, or about $0.60 per month if I schedule the backup weekly. If you have lots of users on your box or applications that store lots of data, it might be better for you to perform incremental backups to save on your transfer costs.
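To estimate the transfer cost for a different backup size or schedule, the arithmetic above can be wrapped in a small function. This is only a sketch: the function name monthly_cost is hypothetical, and the $0.10 per GB rate is inferred from my own figures ($0.15 for 1.5GB), not taken from a current price list.

```shell
# Hypothetical helper: upload cost per month in dollars.
monthly_cost() {
    # $1: GB uploaded per run, $2: dollars per GB, $3: runs per month
    awk -v gb="$1" -v rate="$2" -v runs="$3" \
        'BEGIN { printf "%.2f\n", gb * rate * runs }'
}

monthly_cost 1.5 0.10 4   # prints 0.60
```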
If you have other ideas or if I missed anything, please feel free to leave a comment!
Extra Reading
http://www.freebsd.org/doc/en/books/handbook/backup-basics.html