Pattern extraction and clustering for high-dimensional discrete data
Jiang, Peng
Permalink
https://hdl.handle.net/2142/46604
Description
- Title
- Pattern extraction and clustering for high-dimensional discrete data
- Author(s)
- Jiang, Peng
- Issue Date
- 2014-01-16T17:55:55Z
- Director of Research (if dissertation) or Advisor (if thesis)
- Heath, Michael T.
- Doctoral Committee Chair(s)
- Heath, Michael T.
- Committee Member(s)
- Olson, Luke N.
- Zhai, ChengXiang
- Park, Haesun
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- low-rank matrix factorization
- binary matrix factorization
- k-means clustering
- approximation algorithm
- pattern extraction
- association rule mining
- document clustering
- weighted binary matrix factorization
- bicluster discovery
- densest k-subgraph
- social network mining
- Abstract
- We explore connections of low-rank matrix factorizations with interesting problems in data mining and machine learning. We propose a framework for solving several low-rank matrix factorization problems, including binary matrix factorization, constrained binary matrix factorization, weighted constrained binary matrix factorization, densest k-subgraph, and orthogonal nonnegative matrix factorization. These combinatorial problems are NP-hard. Our goal is to develop effective approximation algorithms with good theoretical properties and apply them to solve various real application problems. We reformulate each of the problems as a special clustering problem that has the same optimal solution as the corresponding original problem. Making use of this property, we develop clustering algorithms to solve corresponding low-rank matrix factorization problems. We prove that most of our clustering algorithms have constant approximation ratios, which is a highly desirable property for NP-hard problems. We apply the proposed algorithms and compare them with existing methods for real applications in pattern extraction, document clustering, transaction data mining, recommender systems, bicluster discovery in gene expression data, social network mining, and image representation.
- Graduation Semester
- 2013-12
- Permalink
- http://hdl.handle.net/2142/46604
- Copyright and License Information
- Copyright 2013 Peng Jiang
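
The reformulation idea in the abstract can be illustrated on its simplest case. The sketch below is not taken from the dissertation. It assumes a constrained variant in which every row of a binary matrix X is assigned to exactly one pattern, so U has a single 1 per row, and minimizing the number of mismatches between X and UV becomes the same problem as clustering the rows of X around binary centers under Hamming distance. The function names (`cluster_bmf`, `majority_center`, `reconstruction_error`), the initialization, and the Lloyd-style alternation are illustrative assumptions, and this heuristic carries no approximation guarantee.

```python
def hamming(a, b):
    """Number of positions where two equal-length 0/1 sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def majority_center(rows, width):
    """Binary center minimizing total Hamming distance: per-column majority vote."""
    if not rows:
        return [0] * width  # empty cluster: fall back to the all-zero pattern
    return [1 if 2 * sum(r[j] for r in rows) > len(rows) else 0
            for j in range(width)]


def cluster_bmf(X, k, max_iter=100):
    """Approximate X (n x d, 0/1) by U (n x k) times V (k x d), both 0/1.

    Assumed constraint: each row of U contains exactly one 1, so row i of
    the product U V is simply the center assigned to row i of X.
    """
    d = len(X[0])
    # Hypothetical deterministic initialization: the first k distinct rows.
    V = []
    for row in X:
        if list(row) not in V:
            V.append(list(row))
        if len(V) == k:
            break
    while len(V) < k:
        V.append([0] * d)

    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center in Hamming distance.
        new_labels = [min(range(k), key=lambda c: hamming(row, V[c]))
                      for row in X]
        if new_labels == labels:
            break  # assignments are stable, V already matches them
        labels = new_labels
        # Update step: majority vote within each cluster.
        V = [majority_center([X[i] for i in range(len(X)) if labels[i] == c], d)
             for c in range(k)]

    U = [[1 if labels[i] == c else 0 for c in range(k)] for i in range(len(X))]
    return U, V


def reconstruction_error(X, U, V):
    """Count entries where X differs from the matrix product U V."""
    product = [[sum(U[i][c] * V[c][j] for c in range(len(V)))
                for j in range(len(V[0]))]
               for i in range(len(U))]
    return sum(hamming(x_row, p_row) for x_row, p_row in zip(X, product))


# Toy data: two noisy copies of the patterns 1100 and 0011.
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
U, V = cluster_bmf(X, k=2)
print(V)
print(reconstruction_error(X, U, V))
```

On this toy input the rows of V are the extracted patterns and U records which pattern each row of X follows. The unconstrained problems named in the abstract allow a row to combine several patterns, which this sketch does not attempt.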
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science