Withdraw
Loading…
Algorithms on graph-structured data with imperfect information
Zhou, Huozhi
Loading…
Permalink
https://hdl.handle.net/2142/105609
Description
- Title
- Algorithms on graph-structured data with imperfect information
- Author(s)
- Zhou, Huozhi
- Issue Date
- 2019-06-19
- Director of Research (if dissertation) or Advisor (if thesis)
- Varshney, Lav R.
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Graph data
- Imperfect information
- Efficient inference
- Abstract
- Graph-structured data is able to characterize pairwise or even higher-order relations among different data points, and has been demonstrated to be highly advantageous in various data mining and machine learning applications. Such graph-structured data may either come from real life networks, or some transformation based on data points. However, in practice the measurement of graph-structured data is usually partially incomplete or incorrect. For example, the measured states of nodes in the graph might be incorrect due to sensor noise. In this thesis, we study two problems on graph-structured data with imperfect information: hypergraph-based active learning and source estimation on directed acyclic graphs (DAGs), all with provable statistical guarantees. In the first part of this thesis, we propose an active learning scheme which is able to accommodate the structure of hypergraphs, termed HS2. HS2 generalizes the previously proposed S2 algorithm which is only able to solve graph-based active learning (GAL) with pointwise oracle. Our HS2 is more flexible in the sense that it is adaptable for three different types of oracles: pointwise oracle, pairwise oracle, as well as noisy pairwise oracle. Based on a novel parametric system particularly designed for hypergraphs, we derive theoretical results on the query complexity of HS2 for the above described settings. Both the theoretical and empirical results show that HS2 outperforms the naive combination of clique expansion and GAL algorithms. Next we develop a heuristic, termed generalized Jordan center (GJC), to estimate the source of a spreading process on a DAG based on noisy and incomplete observations. This problem is motivated by contamination diffusion in a food supply chain. For this setting, identifying the source correctly and efficiently as well as inferring states of unobserved events are of top priorities (the recall problem). We believe this is the first work on source estimation with noisy information. Under mild conditions, GJC is the maximum likelihood (ML) estimator of the diffusion source. Our proposed heuristic is parameter-free (only needs to know the structure of the DAG and states of some nodes), and can be evaluated efficiently by a message-passinglike algorithm in ~O (jV j) complexity (the tilde notation means ignoring the logarithm factor), where V is the vertex set. Experiments on both synthetic and real networks show that GJC has significant gains over a naive extension of Jordan center and is comparable to the exact ML estimate, in terms of source detection probability and false negative rate for recall.
- Graduation Semester
- 2019-08
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/105609
- Copyright and License Information
- Copyright 2019 Huozhi Zhou
Owning Collections
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer EngineeringGraduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at IllinoisManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…