The kivaloo data store
Kivaloo (pronounced "kee-va-lieu") is a collection of utilities which together form a data store associating keys of up to 255 bytes with values of up to 255 bytes. It was designed to satisfy the needs of the Tarsnap online backup service for high-performance key-value storage, although it is not yet being used for that purpose.
At present, kivaloo comprises a block store (lbs) providing log-structured storage within a local filesystem; a key-value store (kvlds) which manages a log-structured B+Tree and services requests upon it from a single connection; and a request multiplexer (mux) which accepts multiple connections and routes requests and responses to and from a single "upstream" connection.
It is likely that other components will be added in the future to add more features (e.g., replication) or provide alternatives (e.g., other forms of underlying storage).
Kivaloo has some features which distinguish it from most other "nosql" data stores:
- Kivaloo is durable: Writes are not acknowledged until data has been fsynced to disk. (The lbs utility supports a "data loss mode" which skips fsyncing for debugging purposes.)
- Kivaloo is strongly consistent: If operation A completes before operation B starts, operation B will see the results of operation A.
- Unlike most data stores based on log-structured storage, the kvlds utility performs background cleaning based on the I/O rate and the amount of disk space used; thus there is no need to for separate periodic "compaction" to be performed.
- Due to the use of a B+Tree to store key-value pairs, kivaloo supports "RANGE" requests.
On an EC2 c1.medium instance using ephemeral-disk storage, with a 2 kB B+Tree page size and 40-byte keys and 40-byte values:
- Bulk inserts run at about 125,000 pairs per second.
- Bulk extracts run at about 30,000 pairs per second while in RAM, and about 20,000 pairs per second from disk.
- Bulk updates run at about 110,000 pairs per second while in RAM, drop to about 75,000 pairs per second while pages are in the operating system disk cache, and then to 60,000 pairs per second when the keys being updated need to be loaded from disk.
- Random reads run at about 160,000 to 220,000 pairs per second while in RAM, drop to about 6,000 to 11,000 pairs per second while pages are in the operating system disk cache, and are disk-seek-limited when pages need to be loaded from disk.
- Random mixed (50% read, 50% update) runs at about 14,000 to 30,000 pairs per second while in RAM, drops to around 1,000 to 4,000 pairs per second while in the operating system disk cache, and is disk-seek-limited when pages need to be loaded from disk.
- Hot-spot read (pick a random set of 65536 consecutive keys, read them in a random order, and repeat — this is similar to one of Tarsnap's read access patterns) runs at about 220,000 pairs per second while in RAM, and about 60,000 pairs per second from disk.
For more details, see the kivaloo performance page.
The following versions of kivaloo are available:
|Version||Release date||SHA256 hash|
The kivaloo data store is discussed on the email@example.com mailing list.