Withdraw
Loading…
Harnessing rare category trinity for complex data
Zhou, Dawei
Loading…
Permalink
https://hdl.handle.net/2142/113901
Description
- Title
- Harnessing rare category trinity for complex data
- Author(s)
- Zhou, Dawei
- Issue Date
- 2021-12-02
- Director of Research (if dissertation) or Advisor (if thesis)
- He, Jingrui
- Doctoral Committee Chair(s)
- He, Jingrui
- Committee Member(s)
- Han, Jiawei
- Ji, Heng
- Akoglu, Leman
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Rare Category Analysis, Graph Mining
- Abstract
- "In the era of big data, we are inundated with the sheer volume of data being collected from various domains. In contrast, it is often the rare occurrences that are crucially important to many high-impact domains with diverse data types. For example, in online transaction platforms, the percentage of fraudulent transactions might be small, but the resultant financial loss could be significant; in social networks, a novel topic is often neglected by the majority of users at the initial stage, but it could burst into an emerging trend afterward; in the Sloan Digital Sky Survey, the vast majority of sky images (e.g., known stars, comets, nebulae, etc.) are of no interest to the astronomers, while only 0.001% of the sky images lead to novel scientific discoveries; in the worldwide pandemics (e.g., SARS, MERS, COVID19, etc.), the primary cases might be limited, but the consequences could be catastrophic (e.g., mass mortality and economic recession). Therefore, studying such complex rare categories have profound significance and longstanding impact in many aspects of modern society, from preventing financial fraud to uncovering hot topics and trends, from supporting scientific research to forecasting pandemic and natural disasters. In this thesis, we propose a generic learning mechanism with trinity modules for complex rare category analysis: (M1) Rare Category Characterization - characterizing the rare patterns with a compact representation; (M2) Rare Category Explanation - interpreting the prediction results and providing relevant clues for the end-users; (M3) Rare Category Generation - producing synthetic rare category examples that resemble the real ones. The key philosophy of our mechanism lies in ""all for one and one for all"" - each module makes unique contributions to the whole mechanism and thus receives support from its companions. In particular, M1 serves as the de-novo step to discover rare category patterns on complex data; M2 provides a proper lens to the end-users to examine the outputs and understand the learning process; and M3 synthesizes real rare category examples for data augmentation to further improve M1 and M2. To enrich the learning mechanism, we develop principled theorems and solutions to characterize, understand, and synthesize rare categories on complex scenarios, ranging from static rare categories to time-evolving rare categories, from attributed data to graph-structured data, from homogeneous data to heterogeneous data, from low-order connectivity patterns to high-order connectivity patterns, etc. It is worthy of mentioning that we have also launched one of the first visual analytic systems for dynamic rare category analysis, which integrates our developed techniques and enables users to investigate complex rare categories in practice."
- Graduation Semester
- 2021-12
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/113901
- Copyright and License Information
- Copyright 2021 Dawei Zhou
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at IllinoisManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…