Abstract:
Hierarchical Agglomerative Clustering (HAC) algorithms
are used in many applications where clusters have
a hierarchical relationship among them. Their parallelization
is challenging due to the dependence of every agglomeration
step on all previous agglomerations. Although a few parallel
algorithms have been proposed for the SLINK HAC algorithm,
only limited work has been done to parallelize other HAC
algorithms. In this paper, we present a high-level abstraction
that provides a uniform way to specify any HAC algorithm,
together with a framework that automatically parallelizes such
algorithms for distributed-memory systems. The abstraction is
supported by constructs in a high-level, domain-specific language, and
a compiler translates algorithms expressed in this language
to efficient parallel code targeting distributed systems. Our
experiments on multiple HAC algorithms show that the runtime
performance achieved is comparable to that of state-of-the-art manual
parallel implementations on Spark and MPI, while requiring only
a fraction of the programming effort. At runtime, master-slave
execution is used, and load is balanced among the slaves in an
algorithm-agnostic way, in sharp contrast to the custom
load-balancing techniques reported in the literature on parallel
HAC algorithms.