Replication-aware file-system crash consistency
Bhandari, Chaitanya Bhushan
This item's files can only be accessed by the System Administrators group.
Permalink
https://hdl.handle.net/2142/124715
Description
- Title
- Replication-aware file-system crash consistency
- Author(s)
- Bhandari, Chaitanya Bhushan
- Issue Date
- 2024-04-30
- Director of Research (if dissertation) or Advisor (if thesis)
- Alagappan, Ramnatthan
- Ganesan, Aishwarya
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Distributed Systems
- Storage Systems
- Block Stores
- File Systems
- Crash Consistency
- Abstract
- Block stores are at the heart of today’s cloud environments. Several systems, including file systems in VMs and distributed file services, depend on block stores to store their data. All major cloud providers, as a result, offer block storage services: Azure Managed Disk, Google Cloud Persistent Disk, AWS Elastic Block Storage, IBM Cloud Block Storage, and Alibaba Cloud Block Storage. Several open-source block stores (e.g., Ceph Block Device, Sheepdog Block Device, Gluster Block Device, and OpenStack Cinder) exist as well. A defining attribute of block stores is their thin interface with limited and simple commands such as read, write, and flush. While the simplicity has made the block interface stable (it hasn’t really changed in 40 years), it leads to inefficiencies. In particular, block stores understand little about higher layers such as file systems and how those layers use the block store. Our thesis is that by making the block store aware of the semantics of its clients (especially file systems), the inherent redundancy in the system can be used to make strong crash consistency more performant and resource-efficient. To this end, we introduce Replication-Aware File-System Crash Consistency (RepCC), a new technique that achieves data-journaling level crash consistency (the strongest possible crash consistency) at nearly the cost of ext4’s ordered journaling, a mode that provides weaker crash consistency guarantees. We implement RepCC on an open-source block store called Sheepdog and evaluate it using filebench macrobenchmarks and YCSB atop two real applications, SQLite and RocksDB. Our evaluation shows that RepCC provides the strong consistency guarantees of data journaling at better performance, storage bandwidth utilization, and network bandwidth utilization than data journaling, and at a small performance overhead over ordered journaling.
Concretely, we find that RepCC increases throughput over ext4’s data journaling mode by 2.01% to 29.65% for the evaluated real applications and by 1.47× to 4.47× for the microbenchmarks. Additionally, RepCC reduces storage bandwidth utilization by 35.11% to 72.24%, and network bandwidth utilization by 15.80% to 74.94%. Finally, we find that RepCC imposes only a 0.2% to 22.32% throughput penalty over ext4’s ordered journaling mode.
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Chaitanya Bhushan Bhandari
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Graduate Theses and Dissertations at Illinois