Enhancing Transactional Key-Value Storage Systems in Datacenters using Precise Clocks and Software-Defined Storage
Transactional key-value storage is an important service that cloud providers offer for building applications (e.g., Amazon DynamoDB, Microsoft CosmosDB, Google Spanner). This type of service is popular because it provides high-level guarantees such as consistency, scalability, and fault tolerance, which ease application development and deployment in the cloud. Unfortunately, delivering high performance without high complexity is challenging for transactional key-value storage systems in datacenters, for two reasons: the protocols that provide the high-level guarantees (e.g., transaction and replication protocols) are sophisticated, and traversing multiple abstraction layers incurs overhead.
We leverage two emerging datacenter capabilities, precisely synchronized clocks and software-defined storage, to address the performance and complexity challenges of transactional key-value storage systems in datacenters. To this end, we use a cross-layer approach that examines all levels of the storage stack, from developer APIs to the underlying hardware. We show that this methodology enables synergistic interactions between software and the underlying hardware, and leads to simpler system designs with better performance.
This dissertation presents four systems: Semel, Milana, Kairos, and SkimpyFTL. Semel is a multi-version key-value storage system that exploits the remap-on-write property of flash-based Solid State Drives for device-integrated multi-versioning and uses a simplified, unordered (inconsistent) replication protocol for fault tolerance. Milana supports serializable ACID transactions over Semel using an enhanced Optimistic Concurrency Control protocol that leverages precisely synchronized intra-datacenter clocks to reduce the transaction abort rate and enable local validation of read-only transactions. Kairos builds on Milana and adds support for inter-transaction caching and sharded transaction validation; cache consistency in Kairos is based on a simple, stateless, time-to-live protocol with leases, so it does not need to track sharers or send invalidations as directory-based cache consistency protocols do. Finally, SkimpyFTL builds on Semel and adds support for memory-efficient data indexing in flash-based key-value storage systems.
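To make the idea of lease-based, time-to-live cache consistency concrete, the following is a minimal illustrative sketch, not the Kairos implementation. The names LeaseCache, CacheEntry, put, and get are hypothetical, and the sketch assumes that the clock passed in is synchronized closely enough with the storage servers that an absolute expiry time is meaningful on the client. A cached entry is served only while its lease is valid, so the server never has to track which clients hold a copy or send them invalidations.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    value: bytes
    lease_expiry: float  # absolute time (seconds) at which the lease ends


class LeaseCache:
    """Hypothetical client-side cache whose entries expire with their lease."""

    def __init__(self, clock: Callable[[], float] = time.time):
        # Assumption: `clock` is synchronized with the servers' clocks.
        self._clock = clock
        self._entries: Dict[str, CacheEntry] = {}

    def put(self, key: str, value: bytes, ttl: float) -> None:
        # Record the value together with the absolute end of its lease.
        self._entries[key] = CacheEntry(value, self._clock() + ttl)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss: the caller must read from storage
        if self._clock() >= entry.lease_expiry:
            # Lease has ended: drop the entry locally, no server message needed.
            del self._entries[key]
            return None
        return entry.value
```

In this sketch a read that returns None falls through to the storage system, which would hand back a fresh value and lease. How a real system chooses lease lengths, and how writers interact with outstanding leases, is outside what this example shows.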
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations