DSpace Repository

Topical document clustering: two-stage post processing technique

dc.contributor.author Goyal, Poonam
dc.contributor.author Goyal, Navneet
dc.date.accessioned 2022-12-26T10:19:52Z
dc.date.available 2022-12-26T10:19:52Z
dc.date.issued 2018
dc.identifier.uri https://ideas.repec.org/a/ids/ijdmmm/v10y2018i2p127-170.html
dc.identifier.uri http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8136
dc.description.abstract Clustering documents is an essential step in improving the efficiency and effectiveness of information retrieval systems. We propose a two-phase split-merge (SM) algorithm that can be applied to topical clusters obtained from existing query-context-aware document clustering algorithms to produce soft topical document clusters. SM is a post-processing technique that combines the advantages of the document-pivot and feature-pivot approaches to topical document clustering. The split phase divides the topical clusters by relating them to the topics obtained by disambiguating web search results, and converts them into homogeneous soft clusters. In the merge phase, similar clusters are merged using the feature-pivot approach. SM is tested on the output of two hierarchical query-context-aware document clustering algorithms on different datasets, including the TREC Session Track 2011 dataset. The resulting topical clusters are also updated incrementally as the data stream progresses. The proposed algorithm improves clustering quality appreciably in all the experiments conducted. en_US
dc.language.iso en en_US
dc.publisher Inderscience en_US
dc.subject Computer Science en_US
dc.subject Clustering en_US
dc.title Topical document clustering: two-stage post processing technique en_US
dc.type Article en_US

