Please use this identifier to cite or link to this item:
http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/8152
Title: | Rapid Prototyping of Hierarchical Agglomerative Clustering Algorithms for Distributed Systems |
Authors: | Goyal, Poonam; Goyal, Navneet |
Keywords: | Computer Science; Hierarchical Agglomerative Clustering; High Performance Computing; Big Data; Automatic Parallelization |
Issue Date: | 2019 |
Publisher: | IEEE |
Abstract: | Hierarchical Agglomerative Clustering (HAC) algorithms are used in many applications where clusters are hierarchically related. Their parallelization is challenging because every agglomeration step depends on all previous agglomerations. Although a few parallel algorithms have been proposed for the SLINK HAC algorithm, only limited work has been done to parallelize other HAC algorithms. In this paper, we present a high-level abstraction that provides a uniform way to specify any HAC algorithm, and a framework that automatically parallelizes it for distributed-memory systems. The abstraction is supported by constructs in a high-level, domain-specific language, and a compiler translates algorithms expressed in this language into efficient parallel code targeting distributed systems. Our experiments on multiple HAC algorithms show that the runtime performance achieved is comparable with state-of-the-art manual parallel implementations on Spark and MPI while requiring only a fraction of the programming effort. At runtime, master-slave execution is used, and load is balanced among the slaves in an algorithm-agnostic way, in contrast to the custom load-balancing techniques seen in the literature on parallel HAC algorithms. |
URI: | https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9006390; http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8152 |
Appears in Collections: | Department of Computer Science and Information Systems |