Overcoming barriers in data warehouse replatforming
Aleyasen, Amirhossein
Permalink
https://hdl.handle.net/2142/115434
Description
- Title
- Overcoming barriers in data warehouse replatforming
- Author(s)
- Aleyasen, Amirhossein
- Issue Date
- 2022-04-22
- Director of Research (if dissertation) or Advisor (if thesis)
- Winslett, Marianne
- Doctoral Committee Chair(s)
- Winslett, Marianne
- Committee Member(s)
- Park, Yongjoo
- Alawini, Abdussalam
- Ludäscher, Bertram
- Antova, Lyublena
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Workload Analysis, Data Warehousing, Porting Complexity, Database Replatforming, Adaptive Data Virtualization, Data Replication, Query Routing, Cloud Migration
- Abstract
- With the development of data-warehouse-as-a-service systems, an increasing number of companies have opted to migrate from on-premises warehouse systems to cloud-native systems. However, replatforming data warehouses poses significant challenges because of the syntactic, semantic, functional, and performance differences between data warehousing vendors, such as differences in SQL dialects. In recent years, adaptive data virtualization was proposed as a way to reduce the costs associated with data warehouse migration, by keeping both the network and application layers intact and translating queries in real time between the old and new vendors' SQL dialects. Although adaptive data virtualization greatly reduces the cost of data warehouse replatforming, it still poses several challenges. One of the main challenges is an incomplete understanding of the characteristics, needs, and behavior of existing workloads in traditional data warehouses. In addition, when moving from an on-premises data warehouse to a cloud-based data warehouse, run-time performance can be negatively impacted, even when there is sufficient understanding and analysis of the workload. This dissertation takes a step towards addressing the aforementioned challenges in adaptive data virtualization, by designing and demonstrating a workload analyzer that provides users with in-depth insights into their applications, users, data volume and data variety, logical and physical designs, tables, views, queries, performance, batch load behavior, and ETL processing. We use this workload analyzer to characterize 40 data warehouse workloads from some of the world’s largest enterprises, and highlight the unexpected findings and practical takeaways. We leverage the workload analyzer to design and evaluate a porting complexity model that informs users of the cost of replatforming and helps prioritize applications for replatforming projects.
Deployment and evaluation of the porting complexity model show that companies and experts find the information useful when they are considering migrating their data to a cloud data warehouse. Finally, to address cloud data warehouse throughput concerns for applications using adaptive data virtualization, we design, implement, and evaluate a middleware-based database replication solution that combines the strengths of existing data replication methods while mitigating their limitations. Extensive experiments on real-world and benchmark workloads show the practicality and efficiency of the proposed solution.
- Graduation Semester
- 2022-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Amirhossein Aleyasen
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois