Withdraw
Loading…
Index design for information retrieval applications using database concepts
Mohit Vyas, -
Content Files

Loading…
Download Files
Loading…
Download Counts (All Files)
Loading…
Edit File
Loading…
Permalink
https://hdl.handle.net/2142/108068
Description
- Title
- Index design for information retrieval applications using database concepts
- Author(s)
- Mohit Vyas, -
- Issue Date
- 2020-05-15
- Director of Research (if dissertation) or Advisor (if thesis)
- Chang, Kevin
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Date of Ingest
- 2020-08-26T21:58:09Z
- Keyword(s)
- Information Retrieval
- Databases
- Index Design
- Entity Search
- Abstract
- Integrating database (DB) and information retrieval (IR) technologies has long been an important problem. Recent growth in tagged textual data, which is a combination of structured knowledge bases and unstructured text data, has made principled index design for IR applications even more desirable. In this work, we propose a framework for designing IR applications using DB concepts. We model an IR application as a specialized DB system that is required to support only a fixed predefined query workload rather than general SQL queries. The physical design of an information retrieval system is optimized for a prespecified target query workload. We then identify IR indexes as basic building blocks of the design of an IR application. An IR index is formalized as a look-up function built over a materialized view. The overall ”index design problem” is then to find an index design with lowest cost from a space of candidate designs. Since the space of candidate designs can be doubly exponential in the size of query workload, we give polynomial time heuristic algorithm with provable guarantees for a weighted set cover based relaxation of the original problem. This is then combined with an overarching branch-and-bound based optimality preserving state-space search algorithm that efficiently prunes the state-space by using the above heuristic algorithm. Finally, we show via experiments how the proposed framework can be used to obtain index designs for different kinds IR applications in practice.
- Graduation Semester
- 2020-05
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/108068
- Copyright and License Information
- Copyright 2020 - Mohit Vyas
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at IllinoisDissertations and Theses - Computer Science
Dissertations and Theses from the Siebel School of Computer ScienceManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…