Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/8125
Title: A High Performance Computing Framework for Data Mining
Authors: Goyal, Navneet
Goyal, Poonam
Keywords: Computer Science
Big Data Analytics
Data Mining
Domain Specific Language
High Performance Computing
Issue Date: 2016
Publisher: IEEE
Abstract: Mining large data sets is no longer the prerogative of computer scientists; specialists in a wide variety of domains are performing analytics as a day-to-day activity. Often such analyses are specific to the domain, and analysts are required to devise new algorithms or techniques. For such scenarios, providing a high-level programming environment that delivers high performance on clusters is a challenge. We propose a framework that supports high-level programming using domain abstractions in data mining while delivering scalable performance on commodity clusters, i.e., clusters of multi-core workstations. This framework includes a domain-specific programming language, DWARF, to enable data mining specialists to rapidly prototype algorithms. DWARF is supported by a compiler that automatically parallelizes code by identifying domain-specific patterns and translating them to parallel code that exploits data parallelism and task parallelism. The compiler generates code for a hybrid virtual machine that supports a distributed-memory model at the top level with a shared-memory model nested within. The code generated by the compiler can be scheduled on commodity clusters. We compare the proposed framework with other frameworks commonly used for data mining on distributed platforms.
URI: https://ieeexplore.ieee.org/document/7837043
http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8125
Appears in Collections: Department of Computer Science and Information Systems
