Withdraw
Loading…
Isotropy criteria and algorithms for data clustering
Shetty, Sanketh V.
Loading…
Permalink
https://hdl.handle.net/2142/24235
Description
- Title
- Isotropy criteria and algorithms for data clustering
- Author(s)
- Shetty, Sanketh V.
- Issue Date
- 2011-05-25T14:59:07Z
- Director of Research (if dissertation) or Advisor (if thesis)
- Ahuja, Narendra
- Doctoral Committee Chair(s)
- Ahuja, Narendra
- Committee Member(s)
- Forsyth, David A.
- Han, Jiawei
- Hasegawa-Johnson, Mark A.
- Huang, Thomas S.
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Data Clustering
- Isotropy Criteria
- Abstract
- Given a set of points, the goal of data clustering is to group them into clusters, such that the internal homogeneity of points within each cluster contrasts to inter-cluster heterogeneity. Over the last fifty years, many methods for data clustering have been developed in diverse scientific communities. However, many of these methods suffer from several shortcomings, and are unable to handle the rich diversity of cluster structures that are usually present in data. We develop an unsupervised, nonparametric approach to data clustering that addresses these shortcomings. Our goal is to build on the strengths of these methods, while simultaneously offering innovative solutions to their limitations. In our cluster model, clusters are seen as groups of points, with overlapping neighborhoods, that have similar spatial structures that are in contrast with their surroundings. We use the isotropy of a point distribution to characterize spatial structure. We argue that identifying the isotropic density neighborhoods of a point, helps in the detection of a diversity of cluster structures that are challenging to many other methods. We develop three different criteria for identifying neighborhoods with isotropic density. The first criterion is based on examining properties of one-dimensional projections in a hyperspherical neighborhood with uniform point distribution. The second and third criteria are based on the analysis of the force transform, generalized to arbitrary inner product spaces. We use these criteria to develop methods for data clustering. In our first approach, we define a cluster as a set of contiguous interior points surrounded by border points. We use the isotropy criteria as a threshold to differentiate between interior and border points. We group interior points to form cluster cores, and then identify cluster borders as formed by the border points neighboring the cluster cores. The algorithm is effective in resolving clusters of different shapes, sizes and densities. It is relatively insensitive to outliers. In our second approach, we adopt the idea of shift vector computation for cluster detection. We present a novel scale-adaptive approach where clusters are detected by moving all points to their cluster cores using shift vectors. We use the isotropic density neighborhoods of a point to determine where and how the shift vectors are computed. We then construct a directed graph induced by these shift vectors. We develop several algorithms for the analysis of this digraph to produce clusters. They range from a simple directed-tree approach to sophisticated spectral analysis of the digraph. We evaluate these algorithms on challenging real datasets and compare their performance against popular and classic algorithms in data clustering, such as k-means, mixture model clustering, spectral clustering, density-based and shift based methods. We show that our methods outperform these algorithms on many challenging clustering tasks. We also apply our algorithms to image and video segmentation.
- Graduation Semester
- 2011-05
- Permalink
- http://hdl.handle.net/2142/24235
- Copyright and License Information
- Copyright 2011 Sanketh V. Shetty
Owning Collections
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer EngineeringGraduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at IllinoisManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…