Data Mining via Support Vector Machines: Scalability, Applicability, and Interpretability

Yu, Hwan-Jo

Permalink

https://hdl.handle.net/2142/81644

Description

Title: Data Mining via Support Vector Machines: Scalability, Applicability, and Interpretability
Author(s): Yu, Hwan-Jo
Issue Date: 2004
Doctoral Committee Chair(s): Han, Jiawei
Department of Study: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree Name: Ph.D.
Degree Level: Dissertation
Date of Ingest: 2015-09-25T20:19:40Z
Keyword(s): Computer Science
Language: eng
Abstract: KDD (Knowledge Discovery and Data Mining) has been extensively studied in the last decade as data continuously increases in size and complexity. This thesis introduces three practical data mining problems: (1) classifying with large data sets, (2) classifying without negative data (i.e., single-class classification), and (3) discovering discriminant feature combinations. It presents solutions based on a principled methodology, Support Vector Machines (SVMs), to produce higher-quality results with less human intervention. We first address several challenges in adopting SVM technology in the practice of data mining: (1) scalability: SVMs do not scale well with data size, while common data mining applications often involve millions or billions of data objects; (2) applicability: SVMs are limited to (semi-)supervised learning, which is mostly applied to binary classification problems; and (3) interpretability: it is hard to interpret and extract knowledge from SVM models. We then propose three principled solutions that address these challenges for the problems of large-scale classification, single-class classification, and discriminant feature combination discovery. The contributions of this thesis cover applications in bioinformatics and text and Web mining as well as methodologies of data mining and machine learning.
Graduation Semester: 2004
Type of Resource: text

Owning Collections

Graduate Dissertations and Theses at Illinois (primary)

Dissertations and Theses - Computer Science
