Incremental MapReduce for K-Medoids Clustering of Big Time-Series Data
Date
2018
Publisher
IEEE
Abstract
Data mining results must be refreshed regularly, because earlier results become stale and obsolete as the underlying data changes and evolves. Clustering is an important data mining technique that groups similar data points together. To mine the data now being generated at an exponential rate, MapReduce, a parallel programming framework, can be combined with the k-medoids clustering algorithm to arrive at the optimum results quickly. Owing to the parallel processing architecture of Hadoop, the proposed iterative algorithm, which processes incremental data using an intermediate key file, performed better than conventional k-medoids.
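The general idea of expressing k-medoids as map and reduce steps can be illustrated with a small single-process sketch. This is not the paper's implementation: the abstract does not describe the algorithm's details or the contents of the intermediate key file, so everything below is an assumption made for illustration. The function names (map_phase, reduce_phase, kmedoids) are hypothetical, each time series is assumed to be a fixed-length tuple of numbers compared by Euclidean distance, and the incremental step is approximated by warm-starting from the previously computed medoids.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(series, medoids):
    """Map: emit (index of the nearest medoid, series) for each input series."""
    for s in series:
        key = min(range(len(medoids)), key=lambda i: dist(s, medoids[i]))
        yield key, s


def reduce_phase(members):
    """Reduce: choose the member with the smallest total distance to the rest."""
    return min(members, key=lambda c: sum(dist(c, m) for m in members))


def kmedoids(series, medoids, max_iter=20):
    """Repeat map/reduce rounds until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)  # stands in for the shuffle step
        for key, s in map_phase(series, medoids):
            groups[key].append(s)
        # An empty cluster keeps its old medoid (an illustrative choice).
        updated = [
            reduce_phase(groups[i]) if groups[i] else medoids[i]
            for i in range(len(medoids))
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids


# Initial run on the data available so far.
old_batch = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
first = kmedoids(old_batch, [(0, 1), (10, 11)])

# Assumed incremental step: reuse the saved medoids as the starting point
# when a new batch arrives, instead of restarting from arbitrary seeds.
new_batch = [(0, 2), (0, 3), (0, 4)]
second = kmedoids(old_batch + new_batch, first)
print(first, second)
```

In a real Hadoop job the map and reduce functions would run in parallel across data partitions; the loop here only mimics that data flow on one machine. Warm-starting from earlier medoids usually reduces the number of iterations needed after new data arrives, which is one plausible reading of how an incremental variant could save work.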
Keywords
K-Medoids, Big Data, MapReduce, Clustering, Time series data