Mining social sensing data: Representation, modeling, and applications
Shao, Huajie
Permalink
https://hdl.handle.net/2142/113012
Description
- Title
- Mining social sensing data: Representation, modeling, and applications
- Author(s)
- Shao, Huajie
- Issue Date
- 2021-07-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Abdelzaher, Tarek
- Doctoral Committee Chair(s)
- Abdelzaher, Tarek
- Committee Member(s)
- Han, Jiawei
- Ji, Heng
- Kaplan, Lance
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Truth Discovery
- Social Sensing
- Maximum-likelihood estimator
- Robustness
- Unsupervised learning
- Abstract
- Social sensing refers to using humans as "sensors" to collect information about external physical events, such as disasters, protests, and traffic. This is motivated by the observation that millions of people today share what they observe in the physical world online via social media platforms and mobile apps. However, social sensing data are noisy, multi-modal, and unreliable. Errors are highly correlated, as many people may retweet the same false information. In addition, social sensing data are distinctive in that their text content and graph structures vary widely across tasks. Hence, supervised learning requires human graders to label data for each new domain, which is labor-intensive. The main goal of this dissertation is to develop unsupervised solutions that learn latent structures and models from observed social media data to advance social sensing applications, including misinformation detection, link prediction, and polarity detection. The work lies in the exploration of learning systems that include both analytic schemes, where the model structure is known ahead of time (but parameters need to be estimated), and data-driven schemes, where the latent model structure itself is to be learned from data without prior knowledge. As a means of exploring the gamut of such unsupervised schemes, we consider a broad range of social sensing applications that call for different modeling complexity. This dissertation first focuses on traditional maximum-likelihood estimators, which estimate unknown parameter values of analytical likelihood models. We consider truth discovery, which estimates the veracity of claims made on social media by different users with unknown reliability. We then extend the truth-finding algorithm by considering additional content features to further improve the quality of the estimation results. Accordingly, we propose two flavors of maximum-likelihood estimators, EM-MultiF and PEM-MultiF, that jointly learn the importance of different content features together with the veracity of observations. To further improve truth discovery, we develop an optimal source selection model that minimizes the expected fusion error when some sources can influence others. To better leverage multi-modal data in social sensing, we next consider representation learning, which uncovers semantically meaningful latent factors from the observed data without the benefit of a known model structure. We first develop a novel maximum-likelihood estimator, ControlVAE, that combines control theory with a Variational Autoencoder (VAE) to disentangle the latent factors from unstructured data. The proposed ControlVAE can not only disentangle the latent factors but also solve the posterior collapse problem. Finally, we apply ControlVAE and representation learning models to different social sensing applications, including misinformation detection and polarity analysis. A trade-off between the different modeling approaches is discussed in terms of applicability, accuracy, and robustness to inform the design of future social sensing systems.
- Graduation Semester
- 2021-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2021 Huajie Shao
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science