Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants

Sharma, Yashvardhan

Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16366

Title:	Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants
Authors:	Sharma, Yashvardhan
Keywords:	Computer Science; BERT-Variants; Hate Speech; Cyber hate; Social media; HASOC; Transformers model; Multilingual BERT; Machine learning (ML)
Issue Date:	2022
Publisher:	CEUR-WS
Abstract:	People now share their ideas on social media on a global scale. The sense of freedom that anonymity provides lets users attack others online without fear of repercussions, which leads to the spread of hate speech. Current attempts to filter online content and stop the propagation of hatred are insufficient. Two contributing factors are the popularity of regional languages on social media and the lack of hate speech detectors that work across multiple languages. This paper addresses two hate speech detection subtasks: the first is the identification of conversational hate speech in code-mixed languages such as Hindi, English, and German, and the second is offensive language identification in Marathi. Our approach uses TF-IDF features combined with machine learning models, as well as transformer-based BERT models, to classify hate speech in each subtask. The MuRIL-BERT model produces the best results, with an accuracy of 73.1% and a macro F1-score of 0.727 on the code-mixed data and a macro F1-score of 0.8306 on the Marathi data, which is 6% higher than the previous year's result.
URI:	https://www.bibsonomy.org/bibtex/1179151f1b137332bbf571f0882070142 http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16366
Appears in Collections:	Department of Computer Science and Information Systems
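
The abstract mentions TF-IDF features as the input to the classical machine learning models. As a rough illustration only, and not the authors' implementation, the sketch below computes one common unsmoothed TF-IDF variant in plain Python. The function name `tfidf` and the toy posts are hypothetical; libraries such as scikit-learn apply a smoothed IDF by default, so their weights would differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df),
    which is one textbook variant among several.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy posts, tokenized on whitespace.
docs = [
    "you are awful".split(),
    "you are kind".split(),
    "have a kind day".split(),
]
vectors = tfidf(docs)
```

In this toy corpus a term that occurs in only one post ("awful") receives a higher weight than a term shared by two posts ("you"), and a term present in every document would receive a weight of zero. The resulting vectors could then be passed to any standard classifier.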

Files in This Item:

There are no files associated with this item.
