Data Cleaning Framework: An Extensible Approach to Data Cleaning
Gu, Randy S.
Permalink
https://hdl.handle.net/2142/18304
Description
Title
Data Cleaning Framework: An Extensible Approach to Data Cleaning
Author(s)
Gu, Randy S.
Issue Date
2011-01-14T22:45:32Z
Director of Research (if dissertation) or Advisor (if thesis)
Chang, Kevin C-C.
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Data Cleaning
Abstract
The growing dependence of society on enormous quantities of information stored electronically has led to a corresponding rise in errors in this information. The stored data can be critically important, necessitating new ways of correcting anomalous records. Current cleaning techniques are very domain-specific and hard to extend, hindering their use in some areas. This work proposes an extensible framework for data cleaning, allowing users to customize the cleaning to their specific requirements. It defines categories of common cleaning operations, allowing more robust support for user-implemented cleaning functions in these categories. The experimental results show that the proposed data cleaning framework is an effective approach to cleaning data for arbitrary domains.