DSpace logo

Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/8114
Title: μDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality
Authors: Goyal, Navneet
Goyal, Poonam
Challa, Jagat Sesh
Keywords: Computer Science
DBSCAN Algorithm
Big Data
Issue Date: 2019
Publisher: IEEE
Abstract: DBSCAN is one of the most popular and effective clustering algorithms that is capable of identifying arbitrary-shaped clusters and noise efficiently. However, its super-linear complexity makes it infeasible for applications involving clustering of Big Data. A major portion of the computation time of DBSCAN is taken up by the neighborhood queries, which becomes a bottleneck to its performance. We address this issue in our proposed micro-cluster based DBSCAN algorithm, μDBSCAN, which identifies core-points even without performing neighbourhood queries and becomes instrumental in reducing the run-time of the algorithm. It also significantly reduces the computation time per neighbourhood query while producing exact DBSCAN clusters. Moreover, the micro-cluster based solution makes it scalable for high dimensional data. We also propose a highly scalable distributed implementation of μDBSCAN, μDBSCAN-D, to exploit a commodity cluster infrastructure. Experimental results demonstrate tremendous improvements in performance of our proposed algorithms as compared to their respective state-of-the-art solutions for various standard datasets. μDBSCAN-D is an exact parallel solution for DBSCAN which is capable of processing massive amounts of data efficiently (1 billion data points in 41 minutes on a 32 node cluster), while producing a clustering that is same as that of traditional DBSCAN.
URI: http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8114
ISSN: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8891020
Appears in Collections:Department of Computer Science and Information Systems

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.