A Fast, Scalable SLINK Algorithm for Commodity Cluster Computing Exploiting Spatial Locality

Goyal, Navneet; Goyal, Poonam

dc.contributor.author	Goyal, Navneet
dc.contributor.author	Goyal, Poonam
dc.date.accessioned	2022-12-26T09:16:57Z
dc.date.available	2022-12-26T09:16:57Z
dc.date.issued	2016
dc.identifier.uri	https://ieeexplore.ieee.org/document/7828388
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8127
dc.description.abstract	The single-linkage (SLINK) hierarchical clustering algorithm is often preferred over traditional partitioning-based clustering because it does not require the number of clusters as input. However, due to its high time complexity and inherent data dependencies, it does not scale well to large datasets. To the best of our knowledge, all existing parallel SLINK algorithms are based on the traditional SLINK algorithm and thus require a large number of computing resources. In this paper, we present a novel optimization of the SLINK algorithm, GridSLINK, which is an order of magnitude faster than the existing state-of-the-art implementation. The optimization in GridSLINK comes from a reduction in the number of distance calculations required by SLINK. This reduction is achieved by exploiting the spatial locality of data points and using an adaptive gridding technique. GridSLINK is parallelized for distributed-memory systems, and its performance scales with an increasing number of compute nodes. The proposed parallel algorithm, dGridSLINK, is benchmarked against the best existing parallel algorithm in the literature and is found to outperform it on all the real datasets considered. dGridSLINK can cluster millions of data points in a few seconds to minutes using a small number of processing elements, without compromising the quality of clustering.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.subject	Computer Science	en_US
dc.subject	Parallel computing	en_US
dc.subject	Multi-core processors	en_US
dc.subject	Multi-node	en_US
dc.subject	Clustering	en_US
dc.subject	SLINK	en_US
dc.title	A Fast, Scalable SLINK Algorithm for Commodity Cluster Computing Exploiting Spatial Locality	en_US
dc.type	Article	en_US
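
The abstract attributes the speed-up to fewer distance calculations, obtained by exploiting spatial locality through gridding. The record does not include the algorithm itself, so the following is only a minimal sketch of the general idea and not the authors' GridSLINK or dGridSLINK. It assumes 2-D points, a fixed uniform grid instead of the adaptive grid the abstract mentions, a single cut distance instead of the full single-linkage hierarchy, and sequential execution. The names grid_single_linkage, eps, and distance_calls are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def grid_single_linkage(points, eps):
    """Single-linkage clusters at cut distance eps for 2-D points.

    Illustrative sketch only (assumed uniform grid, one cut level);
    not the GridSLINK algorithm described in the abstract.
    Returns (labels, distance_calls).
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Bucket point indices into square cells of side eps. Two points
    # within eps of each other always fall in the same or adjacent cells.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(math.floor(x / eps), math.floor(y / eps))].append(idx)

    distance_calls = 0
    for (cx, cy), members in cells.items():
        # Only the 3x3 block of cells around (cx, cy) can hold neighbours.
        for dx, dy in product((-1, 0, 1), repeat=2):
            neighbours = cells.get((cx + dx, cy + dy))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j:  # evaluate each unordered pair once
                        distance_calls += 1
                        if math.dist(points[i], points[j]) <= eps:
                            ri, rj = find(i), find(j)
                            if ri != rj:
                                parent[ri] = rj

    labels = [find(i) for i in range(len(points))]
    return labels, distance_calls


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 10.0), (10.4, 10.0)]
    labels, calls = grid_single_linkage(pts, eps=0.6)
    print(labels, calls)
```

Cutting a single-linkage hierarchy at distance eps yields the connected components of the graph that links every pair of points at most eps apart, which is why a union-find over grid-local pairs is enough for this sketch. A brute-force version would evaluate all n(n-1)/2 pairs; here only pairs in the same or neighbouring cells are evaluated, so the saving grows as the data spread over more cells.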

This item appears in the following Collection(s)

Department of Computer Science and Information Systems [1099]
