Improve the speed of some Tarsnap operations

Restore large archives faster

Tarsnap extract performance is currently latency-bound; the latency in question is the client→EC2→S3 round trip, and the EC2→S3 step alone is about 50 ms.

The best workaround right now is to run parallel extracts: if you can split your data between multiple archives, or use the --include and --exclude options so that each tarsnap -x invocation extracts a different subset of the files, you should be able to use more of your available bandwidth.

This process has been automated in at least one third-party tool.
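
For example, here is a minimal sketch of a parallel extract; the archive name and the directory patterns are placeholders, so adjust them to your own archive layout:

# Run two extracts concurrently, each restoring a different part of
# the archive, so that the per-request latency of the two transfers
# overlaps.
tarsnap -x -f mycomputer-2015-08-14_08-22-34 --include 'home/*' &
tarsnap -x -f mycomputer-2015-08-14_08-22-34 --include 'var/*' &
wait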

Restore a single file faster

Tarsnap is built on top of the tar utility, which allows an archive to contain multiple copies of the same file (for example, when an updated version of a file is appended to a tape). Because of this, Tarsnap must scan the entire archive to check whether there are other (later) copies of any file it is extracting.

If you have a large archive and know that it contains only a single copy of the file you wish to restore, we recommend using the --fast-read command-line option, which stops reading the archive as soon as the requested file has been extracted.
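
For example, a minimal sketch (with a hypothetical archive name and file path) of restoring one file and stopping as soon as it has been read:

# Extract a single file; --fast-read stops reading the archive once
# the requested path has been matched and extracted.
tarsnap -x --fast-read \
    -f mycomputer-2015-08-14_08-22-34 \
    home/user/important.txt

Note that, as with tar, leading '/' characters are stripped from member names, so the path is given here without one.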

Delete multiple archives faster

Multiple archives can be deleted with the same command; this is usually faster (and never slower) than using individual delete commands:

tarsnap -d \
    -f mycomputer-2015-08-07_13-52-46 \
    -f mycomputer-2015-08-09_19-37-20 \
    -f mycomputer-2015-08-14_08-22-34

In particular, deleting multiple archives at once allows tarsnap to cache metadata rather than downloading it multiple times. The speed-up therefore depends on how similar the archives are; if the archives are completely different then it will not save any time.

For optimal cache performance, sort the list of archives so that archives which share a lot of their contents are deleted consecutively. In most cases, this is the same as sorting the archive names alphabetically.
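
For example, here is a sketch of building such a sorted deletion in a single command; the name prefix is a placeholder, and this assumes archive names contain no whitespace:

# List this machine's archives, sort them so that similar archives are
# adjacent, and pass them all to one delete command.
tarsnap --list-archives | grep '^mycomputer-' | sort |
    sed -e 's/^/-f /' | xargs tarsnap -d

Since this deletes every archive matching the pattern, it is worth running the pipeline without the final xargs step first to review the list.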