DSpace Repository

Rapid Prototyping of Hierarchical Agglomerative Clustering Algorithms for Distributed Systems


dc.contributor.author Goyal, Poonam
dc.contributor.author Goyal, Navneet
dc.date.accessioned 2022-12-27T06:50:14Z
dc.date.available 2022-12-27T06:50:14Z
dc.date.issued 2019
dc.identifier.uri https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9006390
dc.identifier.uri http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8152
dc.description.abstract Hierarchical Agglomerative Clustering (HAC) algorithms are used in many applications where clusters have a hierarchical relationship between them. Their parallelization is challenging because every agglomeration step depends on all previous agglomerations. Although a few parallel algorithms have been proposed for the SLINK HAC algorithm, only limited work has been done to parallelize other HAC algorithms. In this paper, we present a high-level abstraction, which provides a uniform way to specify any HAC algorithm, and a framework for automatically parallelizing such algorithms for distributed memory systems. The abstraction is supported by constructs in a high-level, domain-specific language, and a compiler translates algorithms expressed in this language to efficient parallel code targeting distributed systems. Our experiments on multiple HAC algorithms show that the runtime performance achieved is comparable with state-of-the-art manual parallel implementations on Spark and MPI while requiring only a fraction of the programming effort. At runtime, master-slave execution is used, and load is balanced among the slaves in an algorithm-agnostic way, in significant contrast to the custom load-balancing techniques seen in the literature on parallel HAC algorithms. en_US
dc.language.iso en en_US
dc.publisher IEEE en_US
dc.subject Computer Science en_US
dc.subject Hierarchical Agglomerative Clustering en_US
dc.subject High Performance Computing en_US
dc.subject Big Data en_US
dc.subject Automatic Parallelization en_US
dc.title Rapid Prototyping of Hierarchical Agglomerative Clustering Algorithms for Distributed Systems en_US
dc.type Article en_US
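
The abstract notes that every agglomeration step depends on all previous agglomerations. A minimal sequential sketch in Python may make that dependence concrete. It is not the paper's abstraction, language, or compiler: the function name `hac`, the 1-D integer points, and the pluggable `linkage` parameter are all assumptions chosen for illustration.

```python
from itertools import combinations


def hac(points, linkage=min):
    """Naive agglomerative clustering on 1-D points (illustrative only).

    `linkage` reduces the pairwise point distances between two clusters:
    min gives single linkage, max gives complete linkage.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = [(p,) for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        # The candidate pairs are built from the clusters left by all
        # earlier merges, so step k cannot start before step k-1 ends.
        best = None
        for a, b in combinations(clusters, 2):
            d = linkage(abs(x - y) for x in a for y in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


if __name__ == "__main__":
    for a, b, d in hac([0, 1, 5, 7]):
        print(a, b, d)
```

Swapping `linkage=min` for `linkage=max` changes the algorithm from single to complete linkage without touching the loop, which loosely mirrors the idea of specifying different HAC variants in a uniform way. The distributed, load-balanced execution described in the abstract is not modeled here.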

