Package:Dirvish Backup

Dirvish Backup

Introduction

Dirvish is an excellent, no-nonsense way to back up your drives automatically, with no GUI needed. It is driven by simple text-based configuration files: you set it up once and forget it.

Computer users fall into two groups:-

those that do backups

those that have never had a hard drive fail. -NeddySeagoon

But there are more reasons than hard drive failures, Neddy. A momentary mental blip can delete an important file, a configuration, or a custom program that has taken months to perfect. A simple copy from your nice safe backup can save a lot of heartache.

Dirvish saves images of partitions by date of backup, and it uses hard links to do so. In other words, the initial backup is a true 1:1 copy of your partition. After that, backups of unchanged files are simply hard links to the files in the earlier image. Only changed files are stored again. This means you can keep about 3 weeks of daily backups of a partition (actual experience) on a backup drive about 2.5 times the size of that partition. Very efficient. And it is utterly transparent to the user. If you need to retrieve a file, you won't know (or care!) whether it's a freshly stored file or a hard link. Directory listings look as though all of the files are truly there, for every date backed up.
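To make the hard-link idea concrete, here is a small sketch that mimics two daily images by hand in a temporary directory. The directory and file names are made up for the demonstration, and dirvish itself leaves this work to rsync rather than calling ln.

```shell
#!/bin/bash
# Illustration only: build two fake "images" in a throw-away directory.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir -p "$demo/20141007/tree" "$demo/20141008/tree"
echo "unchanged data" > "$demo/20141007/tree/notes.txt"
echo "old version"    > "$demo/20141007/tree/todo.txt"

# Unchanged file: the newer image gets a hard link, not a second copy.
ln "$demo/20141007/tree/notes.txt" "$demo/20141008/tree/notes.txt"
# Changed file: the newer image stores a fresh copy.
echo "new version" > "$demo/20141008/tree/todo.txt"

# -ef is true when both names refer to the same inode (same data on disk).
if [ "$demo/20141007/tree/notes.txt" -ef "$demo/20141008/tree/notes.txt" ]; then
    echo "notes.txt: shared between both images"
fi
if ! [ "$demo/20141007/tree/todo.txt" -ef "$demo/20141008/tree/todo.txt" ]; then
    echo "todo.txt: stored separately in each image"
fi
```

Both images list notes.txt, but its data occupies disk space only once.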

Emerging the package

To install dirvish, run the following command:

root # emerge -av dirvish

Configuring

Backup Partition

The first thing you will want to do is dedicate a partition to be your backup drive. As noted above, I suggest sizing it at about 2-3 times the total size of all the partitions you intend to back up. In this partition you will want to make subdirectories with meaningful names that YOU understand to refer to the partitions they contain. For example, if you were backing up a Funtoo laptop that had root, home and boot partitions, you might make directories on the backup drive named laptop-root, laptop-home and laptop-boot. Each of these subdirectories will contain a dirvish subdirectory holding a configuration file that controls what does and does not get backed up. For this specific example, this is a quick and easy way to create all of these subdirectories:

root # cd /<mounted backup drive partition>
root # install -d laptop-boot/dirvish
root # install -d laptop-root/dirvish
root # install -d laptop-home/dirvish

There is no limit, so add a directory for every partition, on every computer, that you wish to back up.

Master.conf Configuration

The fact that there is no GUI for dirvish is actually a plus: there is nothing to learn or get confused about. The backups are controlled by one text file in /etc/dirvish called master.conf (global settings) and one file in /<backup drive>/<meaningful partition name>/dirvish called default.conf (per-vault settings). Rather than boring you with a detailed explanation of /etc/dirvish/master.conf, I will list a modified version of mine here and discuss its important points afterwards.

   /etc/dirvish/master.conf - My dirvish master.conf file
# /etc/dirvish/master.conf:
runall:
#	funtoo-root 16:00
#	funtoo-home 16:00
#	funtoo-boot 16:00
	laptop-boot 16:00
	laptop-root 16:00
	laptop-home 16:00
	dockstar 16:00
	pogo1	16:00
	optware 16:00
#	disslowdog 16:00
bank:
	/mnt/auto/backup
image-default: %Y%m%d
log: gzip
index: gzip
exclude:
	/etc/mtab
	/var/lib/nfs/*tab
	/var/cache/apt/archives/*.deb
	/.kde/share/cache/*
	/.firefox/default/*/Cache/*
	/usr/src/**/*.o
	lost+found/
	*.iso
	*.avi
	*.mpeg
	*.mpg
	*.mov
	*.tmp

expire-default: +14 days
#expire-rule:
#       MIN HR    DOM MON       DOW  STRFTIME_FMT
#	*   *     *   *         1    +3 months
#	*   *     1-7 *         1    +1 year
#	*   *     1-7 1,4,7,10  1
#	*   10-19 *   *         *    +4 days
#	*   *     *   *         2-7  +15 days
   Important

Formatting in these config files IS IMPORTANT! Note that some items are indented. This is important! Dirvish will ignore them if you don't indent.

In the "runall:" section you list the partitions you want backed up. In dirvish these are called vaults. You can see that I have some partitions (vaults) commented out: one machine is currently down (broken), and another machine (disslowdog) I only back up a couple of times a week. Note the time given after each entry, 16:00. This is the image_time, the timestamp placed on each backup after it is created. I find it convenient to stamp them that way.
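As a quick sanity check of the indentation rule, the following sketch lists the vault names that appear under runall: in a master.conf-style file. The function name list_runall_vaults and the sample file are my own inventions for illustration. It only assumes the layout shown above and is not how dirvish itself parses its configuration.

```shell
#!/bin/bash
# list_runall_vaults FILE: print the vault names listed under "runall:".
# Assumes an unindented "runall:" line followed by indented "vault time"
# entries; lines starting with "#" are treated as comments.
list_runall_vaults() {
    awk '
        /^runall:/                    { in_runall = 1; next }  # section starts
        /^[^ \t#]/                    { in_runall = 0 }        # next unindented option ends it
        in_runall && /^[ \t]+[^ \t#]/ { print $1 }             # indented, non-comment entry
    ' "$1"
}

# Small sample: one commented vault, two indented vaults, one forgotten indent.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf 'runall:\n#\tfuntoo-root 16:00\n\tlaptop-boot 16:00\n\tlaptop-root 16:00\nlaptop-home 16:00\nbank:\n\t/mnt/auto/backup\n' > "$sample"

list_runall_vaults "$sample"
```

The unindented laptop-home entry does not show up in the output, which is a cheap way to spot a missing tab before a backup silently skips a vault.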

Next comes "bank:". This is the mount point you will use for your backup partition.

"image-default:" is the default timestamp format used to name each day's backup as it is placed in your "meaningful-to-you" directories. For example, here's the actual listing of my disslowdog vault (boot, home and root all on one partition in this case):

root # ls /mnt/auto/backup/disslowdog
20140917  20140922  20140924  20140928  20141005  20141008  dirvish
root # ls /mnt/auto/backup/disslowdog/20141008
index.gz  log.gz  summary  tree
root # ls /mnt/auto/backup/disslowdog/20141008/tree
boot   etc   lib  opt   var
bin  home  net  root  sbin  usr
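The image names in the listing follow the image-default format. Assuming date(1) understands the same strftime conversions, this short sketch shows where an image created today would end up. The bank and vault values are simply the ones from the example above.

```shell
#!/bin/bash
# Values taken from the examples above; adjust to your own bank and vault.
bank=/mnt/auto/backup
vault=disslowdog

# Same format string as "image-default: %Y%m%d" in master.conf.
image_name=$(date +%Y%m%d)
echo "today's image would live in: $bank/$vault/$image_name/tree"
```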

"log:" and "index:" simply tell it the compression method (if any) you want used for log and index files stored in the backup directories.

"exclude:"

You'll want to exclude, for example, any *.mov, *.iso, *.avi and similar files. These types of files can be monstrously large and they don't change from day to day, so it makes more sense to back them up separately rather than in your daily backup, where they would slow things down every day.

"expire-default:" In the same sense that Funtoo is a rolling release, this is a rolling backup. In my case, on the 15th day the oldest image is purged. Dirvish takes care of this completely, and because unchanged files are hard links, any file still referenced by a newer image stays on disk. So after 2 weeks of backups there's always one oldest backup falling off the edge of the earth daily, to make room for newer backups.
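To get a feel for what a 14-day window means for a vault, the sketch below lists image directories whose YYYYMMDD name is older than a cutoff. It is only an illustration: the function name images_older_than is hypothetical, it never deletes anything, and I am assuming the image names follow the image-default format shown earlier. dirvish-expire does its own bookkeeping and should remain the tool that actually removes images.

```shell
#!/bin/bash
# images_older_than VAULT_DIR CUTOFF: print image names (YYYYMMDD) before CUTOFF.
images_older_than() {
    local vault_dir=$1 cutoff=$2 path name
    for path in "$vault_dir"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -d "$path" ] || continue      # skip if nothing matched the glob
        name=${path##*/}                # strip the leading directories
        # YYYYMMDD names compare correctly as plain integers.
        if [ "$name" -lt "$cutoff" ]; then echo "$name"; fi
    done
}

# Demo on a fake vault that mirrors part of the listing above.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/20140917" "$demo/20140928" "$demo/20141008" "$demo/dirvish"

images_older_than "$demo" 20141001
```

With GNU date, a cutoff for "two weeks ago" could be produced with date -d '14 days ago' +%Y%m%d and passed as the second argument.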

"expire-rule:" As you can see, I don't use this function. It allows you to make a finer-grained adjustment with regards to the expiring of older backups. Quite sophisticated really, allowing you to do things such as keeping all Friday backups for 3 months.

There are a ton of other options, see man dirvish.conf to view all of them. The above will be sufficient for the average user I suspect.

Default.conf Configurations

Almost done!

Now you must create a default.conf file for each of your <meaningful-to-you>/dirvish directories (vaults), specific to each partition you are backing up. Let's look at one of my typical default.conf's:

   default.conf - a typical default.conf file
# default.conf:
client: <hostname of computer this partition is on>
tree: /
xdev: 0
rsh: ssh -i /root/.ssh/id_rsa <hostname>
index: gzip
image-default: %Y%m%d

exclude:	#indent!!!
	+ /usr
	+ /usr/src
	+ /usr/src/linux*
	+ /usr/src/linux*/.config
	boot
	home
	dev
	mnt
	proc
	run
	sys
	tmp
	media
	var/cache/man
	root/.ccache
	var/tmp/ccache
	var/lib/alsa
	var/lib/preload
	var/lib/upower
	var/lock/lvm
	etc/resolv.conf
	etc/NetworkManager/system-connections/Auto 114
	etc/lvm/cache/.cache
	etc/network/run/ifstate
	/.cache
	/.mozilla/firefox/*/Cache
	root/.bash_history
	root/.recently-used.xbel
	root/.local/share/Trash**
	var/tmp/**
	.trash/**
	usr/portage/**
	usr/src/**	
	root/.ccache
	root/.cache
   Important

Formatting in these config files IS IMPORTANT! Note that some items are indented. This is important! Dirvish will ignore them if you don't indent.

I shall describe this config file in the same way as the master.conf example.

"client:" This is the actual hostname of the computer containing the partition, or its IP address if you prefer. ssh will use this to connect to it.

"tree:" The base directory of the tree you wish to back up. This example is the root partition of one of my computers. I always give home its own partition. If you used tree: /home/joe, that would be the base the backup starts from, and you couldn't back up the rest of /home from there. Any exclude patterns are also understood to start from that base.

"xdev:" This is a boolean. Set to 1, it tells dirvish not to cross mount points below tree; set to 0, the backup may descend into other filesystems mounted there. I really don't think I use it anywhere.

"rsh:" This is the ssh command you would use to "ssh in" to that machine. Self-explanatory, I think.

"index:" and "image-default:" Already explained for master.conf. Here you have the opportunity to fine-tune them individually for each partition if desired.

"exclude:" Also previously described and self-explanatory. However, there are some things to note here: Here you have the opportunity for fine-grained control, obviously we are not going to backup dev, sys, and similar directories, but you probably don't want to put such things in your master.conf file, they are more appropriate here at the partition level.

Notice also that the first few excludes have a + before them. This makes them includes rather than excludes. Let's take the case of + /usr/src. Here I want to back up the directory structure of /usr/src, so I make it an include, and the directories are created on the backup. But I really don't want the contents of those directories, only the structure. So further down the exclude list (3rd from the bottom) I have usr/src/**. This is an exclude, since there is no + preceding it. The double stars match anything, meaning everything else in /usr/src does not get backed up. Patterns are checked in order and the first match wins, so the kernel .config files matched by + /usr/src/linux*/.config are still saved.
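The include-then-exclude trick can be tried out safely with plain rsync on a throw-away tree. This is a sketch under a few assumptions: rsync is installed, the file names are invented for the demo, and the patterns are passed as --include/--exclude options instead of through a dirvish exclude: list.

```shell
#!/bin/bash
# Build a small fake source tree in a temporary directory.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/src/usr/src/linux-3.16" "$demo/src/usr/src/other" "$demo/dst"
echo "CONFIG_X=y" > "$demo/src/usr/src/linux-3.16/.config"
echo "int x;"     > "$demo/src/usr/src/linux-3.16/main.c"
echo "junk"       > "$demo/src/usr/src/other/file"

if command -v rsync >/dev/null 2>&1; then
    # Rules are checked in order and the first match wins, so the
    # includes must come before the catch-all exclude.
    rsync -a \
        --include='/usr' --include='/usr/src' \
        --include='/usr/src/linux*' --include='/usr/src/linux*/.config' \
        --exclude='usr/src/**' \
        "$demo/src/" "$demo/dst/"
    # Show what actually arrived on the "backup" side.
    (cd "$demo/dst" && find . -mindepth 1 | sort)
else
    echo "rsync not installed, skipping demo"
fi
```

Only the directory skeleton and the kernel .config should arrive in dst. main.c and the other directory are dropped by the usr/src/** exclude.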

Usage

Once you've created a default.conf file for each partition you want to back up, you are done! Forever! OK, almost. You may wish to fine-tune your configuration for the first couple of weeks, weeding out things like browser caches (careful though, sometimes they are needed if a complete reinstall is necessary). Anything largish that doesn't change is a good candidate for weeding, as are caches in general. Conversely, make sure you aren't excluding something that may be important.
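When hunting for weeding candidates, a find one-liner wrapped in a function can show which large media files a partition holds. The function name list_media_files and the 100M default are my own choices for this sketch. The patterns are the media extensions from the master.conf exclude list, and the size suffix assumes a find that accepts M and c units (GNU find does).

```shell
#!/bin/bash
# list_media_files DIR [MIN_SIZE]: print media files under DIR without
# crossing mount points. MIN_SIZE uses find(1) -size syntax (default +100M).
list_media_files() {
    local dir=$1 min_size=${2:-+100M}
    find "$dir" -xdev -type f \
        \( -name '*.iso' -o -name '*.avi' -o -name '*.mpeg' \
           -o -name '*.mpg' -o -name '*.mov' \) \
        -size "$min_size" -print
}

# Demo on a throw-away directory, with a tiny threshold so the file matches.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
echo data > "$demo/movie.avi"
echo data > "$demo/notes.txt"

list_media_files "$demo" +0c
```

On a real system you would point it at the tree being backed up, for example list_media_files /home, and decide from the output which patterns belong in exclude:.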

You must create the initial backups by hand, it's painless though. Using our previous example of laptop-boot, laptop-root and laptop-home you do it like this:

root # dirvish --init --vault laptop-boot
root # dirvish --init --vault laptop-root
root # dirvish --init --vault laptop-home

From then on, there is a convenient script in /etc/dirvish called dirvish-cronjob. It's just what the name sounds like, call it from your cron daemon whenever you want your partitions backed up. For example you could put this in your fcrontab:

%daily 15 03 /etc/dirvish/dirvish-cronjob

And voila! While you sleep your partitions are backed up. Assuming you are asleep at 3:15 in the morning.

That should get you started. There is much good info in man dirvish.conf and also at some of the links at http://www.dirvish.org/. Here we've only briefly touched on the configuration.

Security

You may wish to consider the implications of allowing root ssh logins. Most experts recommend turning off this ability, i.e. setting PermitRootLogin no in /etc/ssh/sshd_config. This creates a problem for dirvish, because it DOES log in as root through ssh. However, there is a secure and (somewhat) easy way of working around this issue: set PermitRootLogin forced-commands-only in /etc/ssh/sshd_config. It will allow ONE command per key to be executed as root on the client computer (the one being backed up). This gives you the best of both worlds: only the ONE command with the ONE key can execute as root, and any other attempted root login will fail.
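Before and after changing the setting, it can help to confirm which PermitRootLogin value a config file actually carries. The helper below is a simplified sketch: root_login_mode is a name I made up, and it ignores Match blocks and Include files. It relies on the fact that sshd uses the first value it finds for a keyword.

```shell
#!/bin/bash
# root_login_mode FILE: print the first uncommented PermitRootLogin value.
root_login_mode() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$1"
}

# Demo on a sample file; the commented-out line must be ignored.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '#PermitRootLogin yes\nPermitRootLogin forced-commands-only\n' > "$sample"

root_login_mode "$sample"
```

On the client you would run root_login_mode /etc/ssh/sshd_config and expect forced-commands-only once the steps in this section are done.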

You are going to intentionally cause dirvish to fail to log in in order to find out the command needed. No need to comprehend, just follow the steps ;)

1. Go ahead and set PermitRootLogin forced-commands-only in /etc/ssh/sshd_config and restart sshd (/etc/init.d/sshd restart) on the client (computer you wish to log into)
2. Temporarily change the "rsh:" statement in the <meaningful to you>/dirvish/default.conf file for the computer being backed up so that it reads:
rsh: /tmp/ssh
3. Now create a temporary ssh script on the SERVER to find out the correct rsync command to be run on the client.
root # echo -e '#!/bin/sh\necho $@ > /tmp/rsync' > /tmp/ssh
root # chmod +x /tmp/ssh
4. Run Dirvish so we can find the command (it will surely fail in a few seconds) and then print out the results and copy to your clipboard.
root # dirvish --vault <client vault> --init
root # cat /tmp/rsync

You will see something similar to this:

client rsync --server --sender -vlHogDtprx --numeric-ids . /
5. Now edit /root/.ssh/authorized_keys on the CLIENT machine. Paste this in, replacing the rsync command with the one YOU got (everything after the leading client name).
   /root/.ssh/authorized_keys - a forced command line entry
command="<YOUR rsync command>",from="<IP Address or hostname of server>",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa <YOUR big long rsa key from the server>

For further clarity, it should look similar to this:

   /root/.ssh/authorized_keys - an almost real entry with a throw-away key
command="rsync --server --sender -vlHogDtpre.iLsf --numeric-ids . /",from="mysuperdeluxecomputer",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCgFtKBHW5+RZm3e2yj6LnbWC/ZEnFouvDXhbEDnnKO2mdVmJjldut7InPwhfjB8aJszLdCT7V1/fypp4us8HXiOYYU/PWmUbGslNTbuVqDZ9o1xlpjWaanmOllgi0vCPe6ZiifE70HPygvHlAEO8M+sMUmIWmVdOin9u8owExoAYD/SK+xLJR2sTunrUZ1kg0phWHVoCTBnzncIW0lBQZARuU2DD/YdIZOIvYnKjLCYz2Qs/7bN/lw8yMw2nuwquVoJqGSXoX6lzrTwCgMXK93CUEvdIbWtbs6qp5q0Ige2d/1d0WB4MB9vFZMXvAVjfHDnMulKPBt9JqtVXV1Uicf root@mysuperdeluxecomputer

6. Change the "rsh:" statement in the <meaningful to you>/dirvish/default.conf file for the computer being backed up back to its ssh setting:

rsh: ssh

And that's it. You should test it with dirvish --vault <client vault> --init. You'll have to repeat these steps for every machine that has PermitRootLogin set to forced-commands-only.
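Step 5 is easy to get wrong by hand, so here is a sketch that assembles the authorized_keys line from the captured command. The function name make_forced_entry is hypothetical, the key is a placeholder rather than a real one, and it assumes the captured line has the single-word "client" prefix shown in the sample output above.

```shell
#!/bin/bash
# make_forced_entry CAPTURED SERVER PUBKEY: build an authorized_keys line.
# CAPTURED is the content of /tmp/rsync from step 4; its first word is the
# client name handed to the fake ssh script, so it is dropped.
make_forced_entry() {
    local captured=$1 server=$2 pubkey=$3
    local forced=${captured#* }    # strip everything up to the first space
    printf 'command="%s",from="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding %s\n' \
        "$forced" "$server" "$pubkey"
}

entry=$(make_forced_entry \
    'client rsync --server --sender -vlHogDtprx --numeric-ids . /' \
    'mysuperdeluxecomputer' \
    'ssh-rsa AAAA...placeholder root@mysuperdeluxecomputer')
echo "$entry"
```

On the server, something like make_forced_entry "$(cat /tmp/rsync)" "$(hostname)" "$(cat /root/.ssh/id_rsa.pub)" would print a line ready to paste into the client's /root/.ssh/authorized_keys.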

   Note

If this is the only root login you use, you are good to go: just use your (for example) /root/.ssh/id_rsa.pub key. If you do other root logins from that machine, you must create an individual key for each task. Also, if you create a specially named key for this task, be sure to change the rsh: variable in the <vault>/dirvish/default.conf file to reflect the proper key name.

Credit for the info in the security section: [1] -- step 5

Dirvish-Cronjob Modifications

While the vanilla dirvish-cronjob script works, it's just a bash script and intentionally made to be modified. So don't be afraid to customize it to your liking. I present mine here, I've been tweaking on it for a few years and it works well for me.

#!/bin/bash
#
# daily cron job for the dirvish package
#
#BANK="/mnt/auto/backup"
# this line saves having to list the location of the backup mount twice, now only in /etc/dirvish/master.conf
# it takes the line right after "bank:" and strips the leading whitespace
BANK=$(sed -n '/^bank:/{n;p;}' /etc/dirvish/master.conf|sed -e 's/^[ \t]*//')
START="$(date +%Y%m%d)"

if [ ! -x /usr/sbin/dirvish-expire  ]; then exit 0; fi
if [ ! -s /etc/dirvish/master.conf ]; then exit 0; fi

# check that backup drive is ready
if [ ! -d "$BANK" ]
	then
	su sputnik -c 'ssh phoenix echo "Backup directory is missing"|festival --tts'
#	exit 1
	#hey man, where's the email if no backup BANK?  try this:
	exit $?
fi
# The meat & taters
/usr/sbin/dirvish-expire --quiet
/usr/sbin/dirvish-runall --quiet
# check for bum backups due to rsync
# this positively notifies me if there is a problem with any backups
# the standard dirvish email isn't so clear about it
# nothing worse than thinking all's well and it's not, just when you need it
VAULT_RSYNC_ERROR=""
for VAULT in $(ls -x "$BANK")
do
	if [ -e "$BANK/$VAULT/$START/rsync_error" ]; then
		VAULT_RSYNC_ERROR="${VAULT_RSYNC_ERROR}${VAULT}\n"
	fi
done
if [ -n "$VAULT_RSYNC_ERROR" ]; then
	# we gots to send somebody email!
	# headers first, then a blank line, then the body (%b expands the \n separators)
	{
		printf 'To: someemailaddy@somewhere.com\n'
		printf 'Subject: rsync errors-some backups were faulty\n\n'
		printf 'The following dirvish vaults had rsync problems.\nNo backups of these vaults were made.\n'
		printf '%b on %s\n' "$VAULT_RSYNC_ERROR" "$START"
	} | sendmail -t
fi
# keep local backup copies of vault configurations, a real bummer to lose those
for VAULT in $(ls -x "$BANK")
do
	if [ -e "$BANK/$VAULT/dirvish/default.conf" ] && [ -e "/etc/dirvish/config_backups/$VAULT/dirvish" ]; then
		cp "$BANK/$VAULT/dirvish/default.conf" "/etc/dirvish/config_backups/$VAULT/dirvish"
	fi
done
exit $?