Integrating multiple conflicting sources by truth discovery and source quality estimation
Zhi, Shi
Permalink
https://hdl.handle.net/2142/50493
Description
Title
Integrating multiple conflicting sources by truth discovery and source quality estimation
Author(s)
Zhi, Shi
Issue Date
2014-09-16
Director of Research (if dissertation) or Advisor (if thesis)
Han, Jiawei
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Truth Discovery
Data Integration
Data Quality
Abstract
Multiple descriptions of the same entity from different sources inevitably lead to inconsistent data or information. Among conflicting pieces of information, which one is the most trustworthy? How can a fraudulent rumor be detected? It is unrealistic to curate and validate the trustworthiness of every piece of information, because human labeling is costly and experts are scarce. Much research has shown that, when finding the truth about each entity, considering the quality of information providers can improve the performance of data integration. Because data sources differ in quality, it is hard to find a general solution that works for every case. Therefore, we start from a general setting of truth analysis and then narrow down to two basic problems in data integration. We first propose a general framework for numerical data that allows flexibility in defining the loss function. Source quality is represented by a vector that models source credibility in different error intervals. We then propose a new method, the No Truth Truth Model (NTTM), to deal with the truth existence problem in low-quality data. Preliminary experiments on real stock data and slot filling data show promising results.