Please use this identifier to cite or link to this item:
http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16376
Title: | Sentiment Analysis of Dravidian Code Mixed Data |
Authors: | Sharma, Yashvardhan |
Keywords: | Computer Science; Long short term memory (LSTM); Forum of Information Retrieval Evaluation |
Issue Date: | 2021 |
Publisher: | Association for Computational Linguistics |
Abstract: | This paper presents the methodologies implemented to classify Dravidian code-mixed comments according to their polarity. Using the available datasets of code-mixed Tamil and Malayalam, three methods are proposed: a sub-word level model, a word embedding based model, and a machine learning based architecture. The sub-word and word embedding based models used a Long Short-Term Memory (LSTM) network along with language-specific preprocessing, while the machine learning model used term frequency–inverse document frequency (TF-IDF) vectorization along with a Logistic Regression model. The sub-word level model was submitted to the track ‘Sentiment Analysis for Dravidian Languages in Code-Mixed Text’ organized by the Forum for Information Retrieval Evaluation in 2020 (FIRE 2020). Although it ranked 5th and 12th in the Tamil and Malayalam tasks respectively in the FIRE 2020 track, this paper improves upon those results to attain final weighted F1-scores of 0.65 for the Tamil task and 0.68 for the Malayalam task. The former score equals that attained by the highest-ranked team in the Tamil track. |
URI: | https://aclanthology.org/2021.dravidianlangtech-1.6/ http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16376 |
Appears in Collections: | Department of Computer Science and Information Systems |
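The abstract reports results as weighted F1-scores. As a minimal sketch of how that metric is commonly computed (per-class F1, averaged with weights proportional to each class's share of the true labels), the following Python function may help. The function name, the label strings, and the toy data are illustrative assumptions and are not taken from the paper or its datasets.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: predicted `label` and actually `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each class's F1 by its share of the true labels.
        score += (n_true / total) * f1
    return score


# Toy data invented for this example only.
gold = ["positive", "positive", "negative", "negative"]
pred = ["positive", "negative", "negative", "negative"]
print(round(weighted_f1(gold, pred), 4))
```

Because the weights follow class frequency, this metric is dominated by the majority class when sentiment labels are imbalanced, which is worth keeping in mind when comparing scores across tasks.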