Exact, Fast and Scalable Parallel DBSCAN for Commodity Platforms

Goyal, Navneet; Goyal, Poonam

URI: https://dl.acm.org/doi/abs/10.1145/3007748.3007773
http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8121

Date: 2017-01

Abstract:

DBSCAN is one of the most popular density-based clustering algorithms, capable of identifying arbitrarily shaped clusters and noise. It is computationally expensive for large datasets. In this paper, we present a grid-based DBSCAN algorithm, GridDBSCAN, which is significantly faster than the state-of-the-art sequential DBSCAN. The efficiency of GridDBSCAN is achieved by reducing the number of neighborhood queries using spatial locality information, without compromising the quality of clusters. We also propose scalable parallel implementations of GridDBSCAN to leverage a multicore commodity cluster. The clustering results of GridDBSCAN and its parallel implementations are exactly the same as those of classical DBSCAN. The performance of the proposed algorithms, both sequential and parallel, is benchmarked against state-of-the-art algorithms through experiments on various real datasets. Experimental results show considerable performance improvements achieved by GridDBSCAN and its parallel implementations.
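
The abstract does not spell out how GridDBSCAN uses spatial locality, so the following is only a generic illustration of the idea, not the authors' algorithm. It is a minimal Python sketch that assumes a uniform grid with cell side equal to eps, so that a neighborhood query only has to scan the cells adjacent to the query point instead of the whole dataset. The names build_grid, region_query and dbscan are hypothetical, and the sketch does not reduce the number of queries the way the paper describes; it only makes each query cheaper while producing the same labels as classical DBSCAN run in the same point order.

```python
from collections import defaultdict
from itertools import product
import math


def build_grid(points, eps):
    """Map each grid cell (a tuple of ints) to the indices of the points in it.

    Assumption: cells are axis-aligned hypercubes with side length eps.
    """
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[tuple(math.floor(x / eps) for x in p)].append(i)
    return grid


def region_query(points, grid, eps, i):
    """Return indices of points within eps of points[i].

    Only the cell containing the point and its adjacent cells are scanned,
    because no point farther than one cell away can be within eps.
    """
    p = points[i]
    cell = tuple(math.floor(x / eps) for x in p)
    found = []
    for offset in product((-1, 0, 1), repeat=len(p)):
        neighbor_cell = tuple(c + o for c, o in zip(cell, offset))
        for j in grid.get(neighbor_cell, ()):
            if math.dist(p, points[j]) <= eps:
                found.append(j)
    return found


def dbscan(points, eps, min_pts):
    """Classical DBSCAN using the grid for neighborhood queries.

    Returns one label per point: 0, 1, ... for clusters and -1 for noise.
    """
    grid = build_grid(points, eps)
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, grid, eps, i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise point becomes a border point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, grid, eps, j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels
```

For example, dbscan(points, eps=0.6, min_pts=2) on a list of 2-D coordinate tuples returns a list of cluster labels. The number of adjacent cells grows as 3^d with the dimension d, so a sketch like this is only practical for low-dimensional data.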
