A Domain Specific Language for Clustering
Date
2016-11
Publisher
Springer
Abstract
Clustering large volumes of data is a complex problem that requires sophisticated algorithms as well as High Performance Computing hardware, such as a cluster of computers. It is highly desirable that data mining experts have a solution that, on the one hand, provides a simple interface for expressing their algorithms in terms of domain-specific idioms and, on the other hand, automatically generates parallel code that can run on a cluster of multicore nodes. The proposed Domain Specific Language (DSL), together with its parallelizing compiler, attempts to provide such a solution. In this paper, we present the design of the DSL, called DWARF. We describe its language constructs along with the rationale behind their inclusion in the language. The abstraction provided by DWARF is compared qualitatively with MapReduce, Spark, and MPI-based implementations to establish the usefulness of the proposed clustering DSL.
Keywords
Computer Science, Clustering, Domain Specific Language, High Performance Computing, Big Data