DSpace Repository

Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants

dc.contributor.author Sharma, Yashvardhan
dc.date.accessioned 2024-11-14T06:20:57Z
dc.date.available 2024-11-14T06:20:57Z
dc.date.issued 2022
dc.identifier.uri https://www.bibsonomy.org/bibtex/1179151f1b137332bbf571f0882070142
dc.identifier.uri http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16366
dc.description.abstract People now express their ideas on social media on a global scale. The anonymity these platforms offer creates a heightened sense of freedom, so users can attack others online without fear of repercussions, which ultimately spreads hate speech. Current attempts to filter online content and stop the propagation of hatred are insufficient. Two factors contribute to this: the popularity of regional languages on social media and the lack of hate speech detectors that work across multiple languages. This paper addresses two hate speech detection tasks: the first is the identification of conversational hate speech in code-mixed languages such as Hindi, English, and German, and the second is offensive language identification in Marathi. Our approach uses TF-IDF features combined with machine learning models, as well as transformer-based BERT models, to classify hate speech in each of the two subtasks. The MuRIL-BERT model produces the best results, with an accuracy of 73.1% and a macro F1-score of 0.727 on the code-mixed data and a macro F1-score of 0.8306 on the Marathi data, which is 6% higher than the previous year's result. en_US
dc.language.iso en en_US
dc.publisher CEUR-WS en_US
dc.subject Computer Science en_US
dc.subject BERT-Variants en_US
dc.subject Hate Speech en_US
dc.subject Cyber hate en_US
dc.subject Social media en_US
dc.subject HASOC en_US
dc.subject Transformers model en_US
dc.subject Multilingual BERT en_US
dc.subject Machine learning (ML) en_US
dc.title Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants en_US
dc.type Article en_US
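
The abstract reports both accuracy and macro F1-score. As a rough illustration of how the two metrics differ, the sketch below computes each from scratch on a tiny invented example. The label names `HOF` and `NOT` and the toy predictions are assumptions made for illustration only; they are not data or results from the paper, and the paper's own evaluation code may differ.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels (not from the paper): an imbalanced two-class set.
gold = ["HOF", "HOF", "NOT", "NOT", "NOT", "NOT"]
pred = ["HOF", "NOT", "NOT", "NOT", "NOT", "NOT"]

print(f"accuracy = {accuracy(gold, pred):.4f}")
print(f"macro F1 = {macro_f1(gold, pred):.4f}")
```

Because macro F1 averages the per-class scores without weighting by class size, a missed example in the smaller class lowers it more than it lowers accuracy, which is one reason shared tasks on imbalanced data often report it.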