Information sampling from online social networks
Kumar, Suhansanu
Permalink
https://hdl.handle.net/2142/110469
Description
- Title
- Information sampling from online social networks
- Author(s)
- Kumar, Suhansanu
- Issue Date
- 2021-04-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Sundaram, Hari
- Doctoral Committee Chair(s)
- Sundaram, Hari
- Committee Member(s)
- Tong, Hanghang
- Koyejo, Sanmi
- Jiang, Meng
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- sampling
- network
- graph
- online social network
- reinforcement learning
- hidden population
- content
- Abstract
- Data sampling from online social networks is a prerequisite for several downstream applications. The massive size of online social networks, coupled with API limitations and restrictions on access to social information, makes sampling a challenging problem. This thesis addresses some of these sampling challenges by proposing novel samplers for attributes (content), hidden attributes (population), and networks in online social networks. Specifically, in Chapter 3 we first propose an information-based sampler for sampling content from online social networks. We leverage the surprise of content to direct our sampler toward informative content. The surprise-based sampling strategy allows us to efficiently sample the shape and boundary of content clusters, which is crucial for several data-mining tasks, including clustering, classification, regression, and attribute discovery. We demonstrate our proposed sampler's efficacy on a suite of thirty real-world networks and four data-mining tasks. We further show through empirical counterfactual analysis that network structure does not hinder the performance of surprise-based link-trace samplers in many real-world datasets. Next, in Chapter 4, we propose a novel attributed search-based sampler to sample hidden populations. We use a decision-tree-based search strategy to query the attribute-search space systematically. Our proposed decision-tree Thompson sampler follows an exploration-exploitation strategy to sample hidden populations from social networks. We demonstrate our sampler's efficacy over a suite of fourteen sampling tasks on three online social sites and five offline datasets. Furthermore, we show how several factors, such as page size, missing information, and noise, affect hidden population sampling in real-world social networks. Finally, in Chapter 5, we propose a novel framework for learning network samplers. First, we show through theoretical proof and empirical evidence that no universal network sampler can preserve all the topological properties of the underlying graph in the sample. To address this non-existence result, we propose a reinforcement learning framework that learns high-quality sampling policies according to application needs. We demonstrate the efficacy of our proposed sampling framework through extensive experiments across ten different graph families and seven diverse tasks. In summary, this thesis develops several strategies for sampling information (attribute, hidden attribute, network) from online social networks while remaining cognizant of the constraints imposed by API restrictions. We propose adaptive samplers that can cater to different application needs.
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2021 Suhansanu Kumar
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science