DSpace Repository

Anytime Frequent Itemset Mining of Transactional Data Streams

dc.contributor.author Goyal, Poonam
dc.contributor.author Goyal, Navneet
dc.contributor.author Challa, Jagat Sesh
dc.date.accessioned 2022-12-27T06:31:56Z
dc.date.available 2022-12-27T06:31:56Z
dc.date.issued 2020-09
dc.identifier.uri https://www.sciencedirect.com/science/article/pii/S2214579620300149
dc.identifier.uri http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8146
dc.description.abstract Mining frequent itemsets from transactional data streams has become essential, with applications such as stock market analysis, retail chain analysis, and web log analysis. Various algorithms have been proposed to efficiently mine single-port and multi-port transactional streams within the constraints of limited time and memory. However, all of them are budget algorithms, i.e., they cannot handle varying inter-arrival rates of transactions or high-speed streams. They are bounded by a maximum inter-arrival rate of transactions, beyond which they fail to process the stream. Nor can these algorithms return a mining result immediately on demand, even at reduced accuracy. These two capabilities, handling variable arrival rates and returning immediate results, characterize an anytime algorithm. We propose AnyFI, the first anytime frequent itemset mining algorithm for data streams. AnyFI uses a novel data structure, the BFI-forest, which can handle transactions arriving at a variable rate. It maintains itemsets in the BFI-forest in such a way that it can give a mining result almost immediately when the time allowance for mining is very small, and can refine its accuracy as the time allowance increases. We also propose MPAnyFI, which extends AnyFI into a parallel framework for anytime frequent itemset mining of multi-port data streams over commodity clusters. It runs AnyFI at each computing node of the cluster. Our extensive experimental analysis shows that AnyFI can handle high stream speeds close to 60,000 transactions/sec with recall close to 100%. The experiments also show the efficiency of MPAnyFI. en_US
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.subject Computer Science en_US
dc.subject Data Streams en_US
dc.subject Frequent Itemset Mining en_US
dc.subject Anytime frequent itemset mining en_US
dc.title Anytime Frequent Itemset Mining of Transactional Data Streams en_US
dc.type Article en_US
