SentiCon: A Concept Based Feature Set for Sentiment Analysis

Mitra, Satanik

dc.contributor.author	Mitra, Satanik
dc.date.accessioned	2024-05-21T10:24:27Z
dc.date.available	2024-05-21T10:24:27Z
dc.date.issued	2018
dc.identifier.uri	https://ieeexplore.ieee.org/abstract/document/8721408
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/14961
dc.description.abstract	Selecting and extracting numerical features that improve the accuracy of sentiment analysis on text data remains an open problem. In supervised machine learning based sentiment analysis, Term Frequency-Inverse Document Frequency (TF-IDF) scores are used as features for classifying the polarity of text data. TF-IDF features are a high-dimensional representation of the importance of each word in a document. They are sparse and do not consider the correlations among words that form the latent concepts in the document. Latent Semantic Analysis (LSA) removes the sparseness of the TF-IDF features by representing them in a low-dimensional matrix, and it extracts those hidden concepts. On the other hand, a natural property of a text document is its information content. Quantitative estimates of Part-of-Speech tags, negation words, sentiment lexicons, etc. represent the quality of the information shared in text data. In this work, we propose an approach to generate a concept-based, domain-specific feature set, SentiCon, by consolidating LSA with the quality of information of the corpus. We have applied Singular Value Decomposition (SVD) to the TF-IDF features to obtain the LSA representation. We have tested SentiCon on two benchmark datasets, the IMDB movie review dataset and the Epinion Cars and Books datasets, using four well-known classifiers: Decision Tree, Random Forest, Support Vector Machine, and K-Nearest Neighbour. We have used the standard performance measures precision, recall, and F-measure to analyze the results.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.subject	Management	en_US
dc.subject	Feature Extraction	en_US
dc.subject	Sentiment analysis	en_US
dc.subject	Machine learning	en_US
dc.subject	Sparse matrices	en_US
dc.subject	Semantics	en_US
dc.title	SentiCon: A Concept Based Feature Set for Sentiment Analysis	en_US
dc.type	Article	en_US
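
The pipeline described in the abstract (TF-IDF, then SVD to obtain LSA concepts, then consolidation with counts that reflect information quality) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the `NEGATIONS` and `LEXICON` word lists, the smoothed IDF formula, the choice of `k`, and all function names are assumptions made for illustration, and Part-of-Speech counts are omitted to keep the example dependency-free apart from NumPy.

```python
import math
from collections import Counter

import numpy as np

# Toy corpus and word lists: illustrative assumptions, not taken from the paper.
DOCS = [
    "the movie was great and the acting was great",
    "the movie was not good and the plot was boring",
    "a great book with a good plot",
    "the car was not great and not reliable",
]
NEGATIONS = {"not", "no", "never"}
LEXICON = {"great", "good", "reliable", "boring"}


def tfidf_matrix(docs):
    """Return (matrix, vocabulary) with one TF-IDF row per document."""
    tokenized = [d.split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: j for j, w in enumerate(vocab)}
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    X = np.zeros((n_docs, len(vocab)))
    for i, toks in enumerate(tokenized):
        for w, c in Counter(toks).items():
            tf = c / len(toks)
            idf = math.log((1 + n_docs) / (1 + df[w])) + 1.0  # smoothed IDF (assumed variant)
            X[i, index[w]] = tf * idf
    return X, vocab


def lsa_features(X, k):
    """Project TF-IDF rows onto the top-k singular directions (truncated SVD)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]


def count_features(docs):
    """Per-document counts of negation words and sentiment-lexicon words."""
    rows = []
    for d in docs:
        toks = d.split()
        rows.append([
            sum(t in NEGATIONS for t in toks),
            sum(t in LEXICON for t in toks),
        ])
    return np.array(rows, dtype=float)


def combined_features(docs, k=2):
    """Concatenate dense LSA concept scores with the count features."""
    X, _ = tfidf_matrix(docs)
    return np.hstack([lsa_features(X, k), count_features(docs)])


features = combined_features(DOCS, k=2)
```

Each row of `features` holds `k` dense concept scores followed by the two counts, and such rows could then be passed to any of the classifiers named in the abstract. How SentiCon actually weights or selects these components is not stated in this record.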

This item appears in the following Collection(s)

Department of Management [436]
