Uncovering urban dynamics via cross-modal representation learning
Zhang, Keyang
Permalink
https://hdl.handle.net/2142/97784
Description
- Title
- Uncovering urban dynamics via cross-modal representation learning
- Author(s)
- Zhang, Keyang
- Issue Date
- 2017-04-27
- Director of Research (if dissertation) or Advisor (if thesis)
- Han, Jiawei
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Urban dynamics
- Activity modeling
- Representation learning
- Embedding
- Abstract
- As urbanization continues to accelerate, systematically modeling people's activities in urban space is increasingly recognized as a crucial socioeconomic task. This task was nearly impossible years ago due to the lack of reliable data sources, yet the emergence of geo-tagged social media (GTSM) data sheds new light on it. Recently, there have been fruitful studies on discovering geographical topics from GTSM data. However, their high computational costs and strong distributional assumptions about the latent topics prevent them from fully exploiting GTSM data. To bridge the gap, we present CrossMap, a novel cross-modal representation learning method that uncovers urban dynamics from massive GTSM data. After extracting activity-related tweets by measuring the dispersion degree of each keyword, CrossMap first applies an accelerated mode-seeking procedure to the extracted tweets to detect the spatiotemporal hotspots underlying people's activities. The detected hotspots not only capture spatiotemporal variations, but also largely alleviate the sparsity of the GTSM data. With the detected hotspots, CrossMap then jointly embeds all spatial, temporal, and textual units into the same space using two different strategies: one reconstruction-based and the other graph-based. Both strategies capture the correlations among the units by encoding their co-occurrence and neighborhood relationships, and learn low-dimensional representations that preserve those correlations. Our experiments show that CrossMap not only significantly outperforms state-of-the-art methods for activity recovery, but also greatly benefits downstream applications such as activity classification. Further, CrossMap can process millions of GTSM records within minutes, making it suitable for monitoring large-scale GTSM streams in practice. We further extend our model in two ways. First, we adopt a novel semi-supervised learning paradigm that leverages activity category information to guide the embedding learning process toward higher-quality embeddings. Second, to overcome the inability of existing models to dynamically accommodate the latest information in the GTSM stream, we propose a method that processes continuous GTSM streams and obtains recency-aware urban activity models on the fly, so that the models reflect up-to-date urban activities.
- Graduation Semester
- 2017-05
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/97784
- Copyright and License Information
- Copyright 2017 Keyang Zhang
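The hotspot-detection step mentioned in the abstract can be illustrated with a minimal sketch. This is plain mean shift with a Gaussian kernel on 2-D points, not the accelerated procedure the thesis describes, and it ignores the temporal dimension. The function name `mean_shift_modes`, the `bandwidth` parameter, and the merge radius of half the bandwidth are assumptions made for illustration only.

```python
import math

def mean_shift_modes(points, bandwidth, max_iter=100, tol=1e-6):
    """Move each 2-D point uphill on a Gaussian kernel density estimate,
    then merge points that converge to (nearly) the same location.

    Returns a list of (x, y, count) tuples, one per detected mode.
    Illustrative only: O(n^2) per iteration, no acceleration.
    """
    modes = []
    for x, y in points:
        for _ in range(max_iter):
            wsum = sx = sy = 0.0
            for px, py in points:
                d2 = (px - x) ** 2 + (py - y) ** 2
                w = math.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian weight
                wsum += w
                sx += w * px
                sy += w * py
            nx, ny = sx / wsum, sy / wsum  # weighted mean of all points
            moved = math.hypot(nx - x, ny - y)
            x, y = nx, ny
            if moved < tol:
                break
        # Merge with an existing mode if close enough (assumed radius: bandwidth / 2).
        for i, (mx, my, count) in enumerate(modes):
            if math.hypot(mx - x, my - y) < bandwidth / 2:
                modes[i] = (mx, my, count + 1)
                break
        else:
            modes.append((x, y, 1))
    return modes
```

Called on a handful of coordinates forming two well-separated groups, the function returns one mode per group together with the number of points assigned to it. Those modes play the role of hotspots in this simplified setting.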
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science