Clustering and comparing information extracted from personal health messages
Jiang, Yunliang
Permalink
https://hdl.handle.net/2142/42485
Description
- Title
- Clustering and comparing information extracted from personal health messages
- Author(s)
- Jiang, Yunliang
- Issue Date
- 2013-02-03T19:47:26Z
- Director of Research (if dissertation) or Advisor (if thesis)
- Schatz, Bruce R.
- Doctoral Committee Chair(s)
- Schatz, Bruce R.
- Committee Member(s)
- Han, Jiawei
- Zhai, ChengXiang
- Mei, Qiaozhu
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Healthcare
- Personal Messages
- Classification
- Clustering
- Comparison
- Abstract
- The development of Web 2.0 techniques has led to the prosperity of online communities, which have spread to various domains and areas of daily life. In the medicine and healthcare domain, online services such as Yahoo! Groups, WebMD and MedHelp offer patients and physicians a platform to discuss health problems, e.g., diseases and drugs, diagnoses and treatments, and also provide a large volume of data for researchers to analyze and explore. However, certain characteristics of personal messages, e.g., being unclean, unstructured and isolated from clinical practice, hinder users’ effective digestion of information at the front end and challenge data analysis at the back end. In this scenario, the objective of my thesis is to apply advanced data mining, information retrieval and natural language processing techniques to effectively analyze and reorganize the rich source of personal health messages from online medical communities, in order to satisfy patients’ information needs and support physicians’ clinical practice. Specifically, in the first part of the dissertation, I introduce an SVM-based multi-class classification method which utilizes term-appearance, lexical and semantic features to effectively classify health messages sampled from our unique dataset of Yahoo! Health Groups into three categories: News, User Comments and Spam. In the second part, I describe a comprehensive system with an extensive evaluation framework to organize and cluster patient outcomes utilizing a topic model, which groups large collections of personal comments into a series of topics, guided by expert comments. In the third part, I address a novel and promising topic, Comparative Effectiveness Research (CER) hypothesis prediction, by presenting a study which evaluates patients’ opinions on different treatments through machine-enabled sentiment analysis or human analysts, utilizing our MedHelp dataset. By suggesting three different methods to compare such opinions, reliable conclusions about patients’ preferences among different treatments can be drawn consistently, which imply the effectiveness of the treatments. Furthermore, the study is extended to demographic analysis to explore preferences within specific groups of people, representing population cohorts.
- Graduation Semester
- 2012-12
- Permalink
- http://hdl.handle.net/2142/42485
- Copyright and License Information
- Copyright 2012 Yunliang Jiang
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science