On Clustering Techniques
Clustering data into sensible groupings is a fundamental and effective tool for data organization, summarization, understanding and learning. It has been the subject of active research in several fields, including statistics (Jain & Dubes, 1988; Hartigan, 1975), machine learning (Dempster, Laird & Rubin, 1977), information theory (Linde, Buzo & Gray, 1980), databases (Guha, Rastogi & Shim, 1998; Zhang, Ramakrishnan & Livny, 1996) and bioinformatics (Cheng & Church, 2000), each with its own perspective, approach and focus. From an application perspective, clustering techniques have been employed in a wide variety of areas such as customer segmentation, hierarchical document organization, image segmentation, microarray data analysis and psychology experiments. Intuitively, the clustering problem can be described as follows: given a set W of n entities, find a partition of W into groups such that the entities within each group are similar to each other while entities belonging to different groups are dissimilar. The entities are usually described by a set of measurements (attributes). Clustering does not use category information that labels the objects with prior identifiers. The absence of label information distinguishes cluster analysis from classification and indicates that the goal of clustering is to find a hidden structure or compact representation of the data, rather than to discriminate future data into categories.
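The problem statement above can be made concrete with a small sketch. The example below uses k-means (Lloyd's algorithm), which is only one of many possible clustering techniques and is not singled out by the text; the function name `kmeans`, the sample data in `W`, the choice of Euclidean distance as the dissimilarity measure, and the use of the first k points as initial centers are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=20):
    """Partition `points` into k groups with Lloyd's algorithm (illustrative sketch)."""
    # Assumption: the first k points serve as initial centers.
    # This keeps the example deterministic but is not a robust initialization.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each entity joins the group with the nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its group's members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # leave an empty group's center in place
        if new_centers == centers:
            break  # centers stopped moving, so the partition is stable
        centers = new_centers
    return labels, centers


# Hypothetical data: six entities, each described by two measurements (attributes).
W = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
labels, centers = kmeans(W, k=2)
print(labels)
```

No category labels are supplied anywhere in this sketch: the grouping comes only from the measurements themselves. On this sample, the three entities near (1, 1) end up sharing one label and the three near (8, 8) share the other.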
CITATION: Ma, Sheng. On Clustering Techniques, edited by Wang, John. Hershey: IGI Global, 2008. Encyclopedia of Data Warehousing and Mining, Second Edition. Available at: https://library.au.int/clustering-techniques