Here are some tips:
- Including and excluding files
- What to back up
- Don't use your own compression
- How to back up "live" files and filesystems
- Printing statistics for all archives
- Copying an archive
- Write-only keys
- Checking which file Tarsnap is processing
- Cleanly stopping a Tarsnap upload
- Receiving emails from creating backups
Including and excluding files
You may select which files to include in backups, either globally or on a per-directory or per-file basis.
What to back up
Tarsnap can be used to back up your entire system by pointing it to /, but this can lead to unnecessarily large backups. We can't tell you exactly what to back up, but we can give you some questions and advice to consider.
The fundamental question is: "if I lost this data, how much effort would it take to recover it?"
Operating system files
- If you are running a standard operating system like Ubuntu x.y or FreeBSD z.w, then it might not be necessary to back up those files — they are easy to reinstall.
- On the other hand, if you are running a heavily-tweaked operating system with a great deal of manually-installed software, then perhaps it would be a good idea to back up everything — although this will incur a large initial archive, future archives will be much smaller thanks to deduplication.
Temporary files
By their very definition, temporary files are not worth backing up. However, simply excluding *tmp* may cause problems; for example, it would affect tmpfile_handler.c and webapp_add_user.tmpl. We therefore recommend narrower exclude patterns that match only genuine temporary files and directories.
There are a few other common patterns for files and directories that are likely not worth backing up: memory dumps, the OS X user cache directory, and the GNOME virtual filesystem:
exclude *.core
exclude /Users/*/Library/Cache
exclude /home/*/.gvfs/
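The over-matching of a broad pattern such as *tmp* can be checked before it goes into a configuration file. The sketch below uses the shell's own glob matching as a stand-in, on the assumption that it behaves like Tarsnap's exclude matching for these simple cases; glob_matches is a hypothetical helper, not part of Tarsnap.

```shell
#!/bin/sh
# glob_matches PATTERN NAME: succeed if NAME matches the shell glob PATTERN.
# Assumption: shell globbing approximates Tarsnap's exclude matching here.
glob_matches() {
    # $1 is deliberately left unquoted so that "case" treats it as a pattern.
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# Report which sample names an overly broad pattern would exclude.
for name in tmpfile_handler.c webapp_add_user.tmpl notes.txt; do
    if glob_matches '*tmp*' "$name"; then
        echo "excluded: $name"
    else
        echo "kept: $name"
    fi
done
```

Both source files are reported as excluded because each name contains the substring "tmp", while notes.txt is kept.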
Don't use your own compression
Don't apply any compression (gzip, bzip2, zip, tar.gz, etc.) to your data — Tarsnap itself will compress data after it performs deduplication. If you compress a file, it will interfere with deduplication, meaning that if you change that file and re-compress it, Tarsnap will upload (almost) all of that compressed file again. It is more efficient (and saves you money!) to let Tarsnap handle the deduplication and compression.
GNU gzip has a --rsyncable option which attempts to compress a file while retaining some amount of deduplication ability. This is not as efficient as using Tarsnap's normal deduplication and compression, but if you must use your own compression for files which are likely to change, --rsyncable could lessen the increase in storage space that you would otherwise incur.
How to back up "live" files and filesystems
For normal home use, Tarsnap needs no special handling. But when used on a production server, there are a few additional considerations.
Backing up a database
- Don't attempt to back up a "live" database. Use your database software to create a static dump (ideally a text file).
- Don't compress the database dump. Tarsnap's deduplication (followed by its compression) is more efficient than attempting to compress the database dump directly.
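The two points above can be combined into one step: write an uncompressed dump, then archive that file. This is a minimal sketch; dump_then_archive is a hypothetical helper, and the dump command is whatever your database software provides.

```shell
#!/bin/sh
# dump_then_archive DUMP_FILE ARCHIVE_NAME DUMP_COMMAND [ARGS...]
# Hypothetical helper: runs DUMP_COMMAND, stores its standard output
# uncompressed in DUMP_FILE, then archives that file with tarsnap.
dump_then_archive() {
    dump_file=$1
    archive_name=$2
    shift 2
    # Write the dump as plain text; stop if the dump itself fails.
    "$@" > "$dump_file" || return 1
    # Archive the static dump rather than the live database files.
    tarsnap -c -f "$archive_name" "$dump_file"
}
```

With PostgreSQL, for instance, this might be called as dump_then_archive /var/backups/mydb.sql "mydb-$(date +%Y-%m-%d)" pg_dump mydb, where the path and database name are placeholders.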
Backing up a live filesystem snapshot
A filesystem snapshot is a copy of a filesystem at a particular time. This allows an "atomic" archive to be created, even if the filesystem modifies files while Tarsnap is running. For an example of the problems which can arise from "non-atomic" archives, consider the following directory layout:
a/
b/lots-of-data.dat
c/important-file.txt
Suppose that Tarsnap processed the directories in the order a, b, c; however, while Tarsnap was processing b, the user moved c/important-file.txt to a. When Tarsnap reached c, important-file.txt was not in that directory, so important-file.txt was not archived at all! A similar scenario (in which Tarsnap processed the directories in reverse order) could result in important-file.txt being archived in both directories. These concerns can also apply to a single file being modified in place: we could end up with half of an old version of the file, and half of the new version.
In order to avoid these problems, filesystem snapshots were developed in the 1990s. Given a normal read-write (RW) filesystem, a filesystem (FS) snapshot is created as a read-only (RO) filesystem, then deleted once the archive is created:
filesystem (RW) --+--> use as normal
                  |
        (create FS snapshot)
                  |
                  +--> FS snapshot (RO) --> run tarsnap --> delete FS snapshot
Please consult the documentation for your operating system or filesystem management software to learn how to create and delete filesystem snapshots.
There is a potential race condition when backing up from a filesystem snapshot which could result in Tarsnap being unable to detect some modifications to a file. If you would like to run Tarsnap on filesystem snapshots, please read about the --snaptime option in the tarsnap man page.
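As one possible shape for the create, archive, delete cycle, the sketch below assumes ZFS, where a snapshot of a mounted dataset appears under .zfs/snapshot/ in its mountpoint. backup_zfs_snapshot, the snapshot name, and the marker file are all hypothetical; the marker is created before the snapshot and handed to tarsnap's --snaptime option, which the man page describes for snapshot backups. Check your own system's documentation before relying on any of this.

```shell
#!/bin/sh
# backup_zfs_snapshot DATASET MOUNTPOINT ARCHIVE_NAME
# Hypothetical helper, assuming ZFS and that DATASET is mounted at MOUNTPOINT.
backup_zfs_snapshot() {
    dataset=$1
    mountpoint=$2
    archive_name=$3
    snapshot_name=tarsnap-backup

    # The marker file is created first, so its modification time
    # predates the snapshot; it is passed to --snaptime below.
    marker=$(mktemp) || return 1

    # Create the read-only snapshot.
    zfs snapshot "$dataset@$snapshot_name" || { rm -f "$marker"; return 1; }

    # Archive the snapshot contents, not the live filesystem.
    tarsnap -c --snaptime "$marker" -f "$archive_name" \
        -C "$mountpoint/.zfs/snapshot/$snapshot_name" .
    status=$?

    # Delete the snapshot and marker whether or not the archive succeeded.
    zfs destroy "$dataset@$snapshot_name"
    rm -f "$marker"
    return "$status"
}
```

A call might look like backup_zfs_snapshot tank/data /tank/data "data-$(date +%Y-%m-%d)", with the dataset and mountpoint replaced by your own.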
Printing statistics for all archives
You can print statistics about all archives with:
tarsnap --print-stats -f '*'
Copying an archive
If you wish to create an identical copy of an archive (for example, having identical archives named backup-2016-01-01-daily, backup-2016-01-01-weekly, and backup-2016-01-01-monthly), we recommend that you create the copy from the existing archive:
tarsnap -c -f backup-2016-01-01-weekly @@backup-2016-01-01-daily
This will download some metadata (the list of blocks in backup-2016-01-01-daily), but this is a relatively small amount of bandwidth.
Alternatively, you could simply create a second archive right after creating the first one; our deduplication algorithm will ensure that no data will be uploaded unnecessarily. This avoids downloading metadata, but if your files have changed between creating the first and second (or subsequent) archives, then your "copy" would not be an exact copy.
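When several copies are wanted from one source archive, the @@ form shown earlier can be wrapped in a small loop. This is a sketch only; copy_archive is a hypothetical helper name.

```shell
#!/bin/sh
# copy_archive SOURCE_ARCHIVE NEW_ARCHIVE [NEW_ARCHIVE...]
# Hypothetical helper: creates each NEW_ARCHIVE as a copy of SOURCE_ARCHIVE.
copy_archive() {
    source_archive=$1
    shift
    for new_archive in "$@"; do
        # "@@name" tells tarsnap to read entries from an existing archive.
        tarsnap -c -f "$new_archive" "@@$source_archive" || return 1
    done
}
```

For the example names used in this section: copy_archive backup-2016-01-01-daily backup-2016-01-01-weekly backup-2016-01-01-monthly.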
Write-only keys
tarsnap-keymgmt creates new keys with (optionally) reduced permissions. In particular, it can create "write-only" keys:
tarsnap-keymgmt --outkeyfile write-only-key.txt -w ~/tarsnap-main-key-file.txt
This allows you to create a write-only key (without a passphrase) which is used to create archives automatically, and a key with more permissions (requiring a passphrase) which is used for restoring or deleting archives. In this system, if an intruder breaks into your server, she would be able to halt your backups (or add new archives with faulty data), but not delete your existing archives.
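An automated job would then point tarsnap at the write-only key explicitly. The sketch below assumes the key file created above and uses tarsnap's --keyfile option; archive_with_key is a hypothetical helper.

```shell
#!/bin/sh
# archive_with_key KEYFILE ARCHIVE_NAME PATH [PATH...]
# Hypothetical helper: creates an archive using a specific key file,
# for example a write-only key used by unattended backups.
archive_with_key() {
    keyfile=$1
    archive_name=$2
    shift 2
    tarsnap --keyfile "$keyfile" -c -f "$archive_name" "$@"
}
```

For example: archive_with_key write-only-key.txt "$(uname -n)-$(date +%Y-%m-%d)" /MY/DATADIR. Restoring or deleting archives would still need the main key file.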
Checking which file Tarsnap is processing
The tarsnap binary responds to the SIGUSR1 POSIX signal (and SIGINFO on platforms which support it) by printing the current file. For example, running this command in one terminal:
tarsnap --dry-run -c ~/src/tarsnap
Then entering this command in a different terminal:
killall -SIGUSR1 tarsnap
Will produce output similar to this in the first terminal:
adding home/td/src/tarsnap/build/tarsnap-keygen (196608 / 637689 bytes)
On BSD systems (including OS X), a SIGINFO can be sent to the active terminal by pressing ^T (CTRL-T).
Cleanly stopping a Tarsnap upload
If you use ^C (CTRL-C) to stop Tarsnap uploading a new archive, you will lose progress back to the last checkpoint. To stop cleanly, use ^\ (CTRL-\) or send the SIGQUIT signal to tell Tarsnap to create a truncated archive. The truncated archive will have ".part" appended to its name.
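After interrupting an upload this way, it can be useful to see which truncated archives exist. A minimal sketch, assuming tarsnap's --list-archives output has one archive name per line; list_partial_archives is a hypothetical helper.

```shell
#!/bin/sh
# list_partial_archives: print the names of archives ending in ".part".
# Returns non-zero when no such archive exists (grep's exit status).
list_partial_archives() {
    tarsnap --list-archives | grep '\.part$'
}
```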
Receiving emails from creating backups
If you are running the /root/tarsnap-backup.sh backup script described in Simple usage, then you may wish to modify it to send you an email with the status, particularly if you are running it automatically with cron:
#!/bin/sh

# User variables
email=firstname.lastname@example.org
tarsnap_output_filename=/tmp/tarsnap-output-temporary.log

# Run backup, capturing all output in the log file
tarsnap -c \
	-f "$(uname -n)-$(date +%Y-%m-%d_%H-%M-%S)" \
	/MY/DATADIR >"$tarsnap_output_filename" 2>&1

# Send email with a subject that reflects the exit status of tarsnap
if [ $? -eq 0 ]; then
	subject="Tarsnap backup success"
else
	subject="Tarsnap backup FAILURE"
fi
mail -s "$subject" "$email" < "$tarsnap_output_filename"
rm "$tarsnap_output_filename"
Naturally, you will want to modify the email variable and the /MY/DATADIR directory name(s).
Setting up command-line mail
The mail command in the script above needs a mail transfer agent (MTA) to deliver messages. If your system does not have an MTA already set up, then we recommend trying ssmtp (also known as sSMTP), which is an extremely simple MTA and thus is easier to configure.
There are many other options available with tarsnap. All of the information on this page, and more, can be found in the man pages.