Tarsnap cryptography
When a system is first registered with the Tarsnap server, the tarsnap-keygen utility, as the name suggests, generates cryptographic keys for the system to use. These keys form the backbone of Tarsnap's security: without them, it is infeasible (given the current state of the art of cryptography) for anyone, whether an attacker or you yourself should you lose your copy of the keys, either to decrypt archives or, equally important, to create archives which will be recognized as valid.
The cryptographic keys used by Tarsnap include:
- A 2048-bit RSA key used for signing archives. This is used in combination with SHA256 and a Merkle hash tree to verify the authenticity of stored archives. (Cryptographers: RSASSA-PSS is used, with SHA256 as the hash function.)
- A 2048-bit RSA key used for encrypting session keys. All the data which the Tarsnap client sends to the server to store is encrypted with per-archive random AES-256 keys; those keys are encrypted with this RSA key and attached to the stored data. (Cryptographers: RSAES-OAEP is used to encrypt the session keys, using MGF1 and SHA256, and the padding verification performed when decrypting is carefully written to be free of timing side channels. The AES-256 keys are used in CTR mode, with sequentially incrementing nonces — generating a new AES-256 key for each session ensures that a key-nonce pair will never be used twice.)
- A 256-bit HMAC-SHA256 key used to protect each individual block of data from tampering. From a cryptographic perspective, this is unnecessary, since a Merkle hash tree protects each archive; but data is compressed using zlib before being stored, so this provides protection against a theoretical attacker who can tamper with stored data and has found a security flaw in zlib decoding. (The zlib decoding process is quite complex — for that matter, the same is true of all decompression algorithms — and such complex code has a relatively large risk of having security vulnerabilities, hence the decision to add an extra layer of security here.)
- Two 256-bit HMAC-SHA256 keys used to generate names for blocks of data stored. Tarsnap uses the same reference-by-hash trick as the author's Portsnap and FreeBSD Update utilities; using HMACs instead of raw SHA256 hashes prevents any information from leaking via the hashes. (Why two keys? One is used to hash data, and the other is used to hash archive names. Yes, that's right, even the names of archives are stored securely.)
- Three 256-bit HMAC-SHA256 keys used to sign requests sent by the Tarsnap client to the Tarsnap server: one for writes, one for reads, and one for deletes. These are the only keys sent to the Tarsnap server by tarsnap-keygen.
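To make the role of the HMAC keys more concrete, here is a minimal Python sketch of two ideas from the list above: naming blocks and archives with keyed hashes, and authenticating a compressed block before it is handed to zlib. It is an illustration under stated assumptions, not Tarsnap's actual code or storage format. The key variables, function names, and the "compressed bytes followed by a 32-byte tag" layout are all hypothetical, and the encryption step is left out entirely.

```python
import hashlib
import hmac
import os
import zlib

# Hypothetical stand-ins for keys that tarsnap-keygen would generate.
DATA_NAME_KEY = os.urandom(32)     # used to name blocks of data
ARCHIVE_NAME_KEY = os.urandom(32)  # used to name archives
BLOCK_AUTH_KEY = os.urandom(32)    # used to protect each stored block

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Compute HMAC-SHA256 of message under key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def block_name(data: bytes) -> str:
    """Keyed name for a block: identical data maps to the same name,
    but the name cannot be computed or checked without the key."""
    return hmac_sha256(DATA_NAME_KEY, data).hex()


def archive_name_id(name: str) -> str:
    """Keyed identifier for an archive name, using a separate key."""
    return hmac_sha256(ARCHIVE_NAME_KEY, name.encode("utf-8")).hex()


def seal_block(data: bytes) -> bytes:
    """Compress a block and append a tag over the compressed bytes.
    (Assumed layout for this sketch only; encryption is omitted.)"""
    compressed = zlib.compress(data)
    return compressed + hmac_sha256(BLOCK_AUTH_KEY, compressed)


def open_block(stored: bytes) -> bytes:
    """Verify the tag first, and only then run the zlib decoder."""
    if len(stored) < TAG_LEN:
        raise ValueError("stored block is too short")
    compressed, tag = stored[:-TAG_LEN], stored[-TAG_LEN:]
    expected = hmac_sha256(BLOCK_AUTH_KEY, compressed)
    # compare_digest avoids leaking where the first mismatch occurs.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("block failed authentication")
    return zlib.decompress(compressed)
```

In this sketch, tampered input never reaches `zlib.decompress`: `open_block` rejects it as soon as the tag fails to match, which is the point of the extra per-block key.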
The Tarsnap client-server protocol avoids the complexity of SSL by using a streamlined protocol which:
- Doesn't use certificates, since the Tarsnap server key is securely pre-distributed as part of the Tarsnap client.
- Only supports a single key exchange algorithm, equivalent to SSL using an RSA_DH certificate. (The Diffie-Hellman exchange is computed using the 2048-bit 'group 14' modulus.)
- Only supports a single set of cryptographic primitives (AES-256 in CTR mode for encryption; HMAC-SHA256 for authentication).
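The key exchange step can be illustrated with a toy Diffie-Hellman exchange in Python. This is only a sketch of the general technique. It does not use the real 2048-bit group 14 modulus (a small Mersenne prime stands in for it, which would be insecure in practice), and it does not authenticate the server's value, which the real protocol relies on the pre-distributed server key to do. The final SHA256 hashing of the shared value is an assumption made for illustration and is not Tarsnap's key derivation. All names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters, for illustration only: 2**127 - 1 is a known Mersenne
# prime, far too small for real use. The real protocol uses the 2048-bit
# "group 14" modulus instead.
TOY_P = 2**127 - 1
TOY_G = 3  # illustrative base, chosen arbitrarily for this sketch


def dh_keypair() -> tuple[int, int]:
    """Return (private exponent, public value g^x mod p)."""
    private = secrets.randbelow(TOY_P - 3) + 2  # uniform in [2, p - 2]
    public = pow(TOY_G, private, TOY_P)
    return private, public


def dh_shared_key(private: int, peer_public: int) -> bytes:
    """Combine our private exponent with the peer's public value and
    hash the result down to 32 bytes (assumed derivation step)."""
    if not 1 < peer_public < TOY_P - 1:
        raise ValueError("peer public value out of range")
    shared = pow(peer_public, private, TOY_P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Each side generates a key pair and sends only its public value.
client_private, client_public = dh_keypair()
server_private, server_public = dh_keypair()

# Both sides arrive at the same 32-byte key without ever transmitting it.
client_key = dh_shared_key(client_private, server_public)
server_key = dh_shared_key(server_private, client_public)
```

A key agreed this way could then feed the single fixed set of primitives described above (AES-256 in CTR mode and HMAC-SHA256), so no algorithm negotiation is needed.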