Sentiment Analysis of Dravidian Code Mixed Data

Sharma, Yashvardhan

Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16376

Title:	Sentiment Analysis of Dravidian Code Mixed Data
Authors:	Sharma, Yashvardhan
Keywords:	Computer Science; Long short term memory (LSTM); Forum of Information Retrieval Evaluation
Issue Date:	2021
Publisher:	Association for Computational Linguistics
Abstract:	This paper presents the methodologies implemented while classifying Dravidian code-mixed comments according to their polarity. With datasets of code-mixed Tamil and Malayalam available, three methods are proposed: a sub-word level model, a word embedding based model and a machine learning based architecture. The sub-word and word embedding based models used a Long Short Term Memory (LSTM) network along with language-specific preprocessing, while the machine learning model used term frequency–inverse document frequency (TF-IDF) vectorization along with a Logistic Regression model. The sub-word level model was submitted to the track ‘Sentiment Analysis for Dravidian Languages in Code-Mixed Text’ proposed by the Forum of Information Retrieval Evaluation in 2020 (FIRE 2020). Although it ranked 5th and 12th for the Tamil and Malayalam tasks respectively in the FIRE 2020 track, this paper improves upon those results to attain final weighted F1-scores of 0.65 for the Tamil task and 0.68 for the Malayalam task. The former score is equivalent to that attained by the highest ranked team of the Tamil track.
URI:	https://aclanthology.org/2021.dravidianlangtech-1.6/ http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16376
Appears in Collections:	Department of Computer Science and Information Systems
