Please use this identifier to cite or link to this item:
http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/8148
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Goyal, Poonam | |
dc.contributor.author | Goyal, Navneet | |
dc.contributor.author | Challa, Jagat Sesh | |
dc.date.accessioned | 2022-12-27T06:38:21Z | |
dc.date.available | 2022-12-27T06:38:21Z | |
dc.date.issued | 2021-03 | |
dc.identifier.uri | https://www.tandfonline.com/doi/full/10.1080/0952813X.2021.1882001 | |
dc.identifier.uri | http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8148 | |
dc.description.abstract | Clustering of data streams has become very popular in recent times, owing to the rapid rise of real-time streaming utilities that produce large amounts of data at varying inter-arrival rates. We propose AnyClus, a framework for anytime clustering of data streams. AnyClus uses a proposed variant of the R-tree, AnyRTree, to capture incoming stream objects arriving at variable rates and to index them as micro-clusters in a hierarchical fashion. The leaf-level micro-clusters produced are aggregated and stored in a logarithmic tilted-time window framework (TTWF). Our extensive experimental analysis shows (i) the capability of AnyClus in handling variable stream speeds (up to 250k objects/second); (ii) its ability to produce micro-clusters of high purity (≈1) and compactness; and (iii) the effectiveness of AnyRTree in handling noise, capturing concept drift, and preserving spatial locality in the indexing of micro-clusters, when compared to existing methods. We also propose a parallel framework, Any-MP-Clus, for anytime clustering of multiport data streams over commodity clusters. Any-MP-Clus uses AnyRTree at each computing node of the cluster (for each stream-port) and maintains the aggregated micro-clusters in TTWF. The experimental results on billion-scale datasets show that Any-MP-Clus is scalable and efficient, and produces clustering of higher quality. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Taylor & Francis | en_US |
dc.subject | Stream data mining | en_US |
dc.subject | Computer Science | en_US |
dc.subject | Anytime Mining | en_US |
dc.subject | Multiport streams | en_US |
dc.subject | Clustering streaming data | en_US |
dc.title | Anytime clustering of data streams while handling noise and concept drift | en_US |
dc.type | Article | en_US |
Appears in Collections: | Department of Computer Science and Information Systems |