Abstract:
Hierarchical Agglomerative Clustering (HAC) algorithms
are used in many applications where clusters have
a hierarchical relationship among them. Their parallelization
is challenging due to the dependence of every agglomeration
step on all previous agglomerations. Although a few parallel
algorithms have been proposed for the SLINK HAC algorithm,
only limited work has been done to parallelize other HAC
algorithms. In this paper, we present a high-level abstraction
that provides a uniform way to specify any HAC algorithm,
together with a framework that automatically parallelizes such
algorithms for distributed-memory systems. The abstraction is
supported by constructs in a high-level, domain-specific language, and
a compiler translates algorithms expressed in this language
to efficient parallel code targeting distributed systems. Our
experiments on multiple HAC algorithms show that the runtime
performance achieved is comparable to that of state-of-the-art manual
parallel implementations on Spark and MPI, while requiring only
a fraction of the programming effort. At runtime, master-slave
execution is used, and load is balanced among the slaves in an
algorithm-agnostic way, in sharp contrast to the custom
load-balancing techniques reported in the literature on parallel
HAC algorithms.