Scalable Mining and Link Analysis Across Multiple Database Relations
Yin, Xiaoxin
Permalink
https://hdl.handle.net/2142/11285
Description
- Title
- Scalable Mining and Link Analysis Across Multiple Database Relations
- Author(s)
- Yin, Xiaoxin
- Issue Date
- 2007-03
- Keyword(s)
- database
- data mining
- Abstract
- Relational databases are the most popular repository for structured data, and are thus one of the richest sources of knowledge in the world. In a relational database, multiple relations are linked together via entity-relationship links. Unfortunately, most existing data mining approaches can only handle data stored in single tables, and cannot be applied to relational databases. Therefore, it is an urgent task to design data mining approaches that can discover knowledge from multi-relational data. In this thesis we study three of the most important data mining tasks in multi-relational environments: classification, clustering, and duplicate detection. Since information is widely spread across multiple relations, the most crucial and common challenge in multi-relational data mining is how to utilize the relational information linked with each object. We rely on two types of information, neighbor tuples and linkages between objects, to analyze the properties of objects and the relationships among them. Because of the complexity of multi-relational data, efficiency and scalability are two major concerns in multi-relational data mining. In this thesis we propose scalable and accurate approaches for each data mining task studied. To achieve high efficiency and scalability, the approaches utilize novel techniques for virtually joining different relations, single-scan algorithms, and multi-resolutional data structures to dramatically reduce computational costs. Our experiments show that our approaches are highly efficient and scalable, and also achieve high accuracy in multi-relational data mining.
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/11285
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).