
Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16387
Full metadata record
DC Field: Value [Language]
dc.contributor.author: Sharma, Yashvardhan
dc.date.accessioned: 2024-11-14T11:21:35Z
dc.date.available: 2024-11-14T11:21:35Z
dc.date.issued: 2020
dc.identifier.uri: https://ceur-ws.org/Vol-2826/T2-32.pdf
dc.identifier.uri: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16387
dc.description.abstract: Detecting and eliminating offensive and hate speech in social media content is an important concern, as hate and offensive speech can have serious consequences in society, ranging from ill-education among youth to hate crimes. Offensive speech identification in countries like India poses several additional challenges because users write their social media posts in code-mixed and romanized variants of multiple languages. HASOC-Dravidian-CodeMix - FIRE 2020 extended the task of offensive speech identification to Dravidian languages. In this paper, we describe our approach in HASOC Dravidian Code-mixed 2020, which topped two out of three tasks (F1-weighted scores of 0.95 and 0.90) and stood second in the third task, trailing the top model by only 0.01 points (F1-weighted score of 0.77). We propose a novel and flexible approach of selective translation and transliteration to obtain better results from fine-tuning and ensembling multilingual transformer networks like XLM-RoBERTa and mBERT. Further, we implemented pre-trained, fine-tuned and ensembled versions of XLM-RoBERTa for offensive speech classification. We open source our work to facilitate further experimentation. [en_US]
dc.language.iso: en [en_US]
dc.publisher: CEUR-WS [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Offensive speech detection [en_US]
dc.subject: Selective translation and transliteration [en_US]
dc.subject: XLM-RoBERTa [en_US]
dc.subject: Transformer Neural Networks [en_US]
dc.title: Siva@ HASOC-Dravidian-CodeMix-FIRE-2020: Multilingual Offensive Speech Detection in Code-mixed and Romanized Text [en_US]
dc.type: Article [en_US]
Appears in Collections: Department of Computer Science and Information Systems

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.