DSpace Repository

Offensive Language Classification of Code-Mixed Tamil with Keras

dc.contributor.author Sharma, Yashvardhan
dc.date.accessioned 2024-11-14T10:41:10Z
dc.date.available 2024-11-14T10:41:10Z
dc.date.issued 2021
dc.identifier.uri https://ceur-ws.org/Vol-3159/T3-14.pdf
dc.identifier.uri http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16381
dc.description.abstract This paper presents the method adopted for Task 1 of the Dravidian-CodeMix-HASOC (Hate Speech and Offensive Content Identification in English and Indo-European Languages) Shared Task, proposed by the Forum for Information Retrieval Evaluation in 2021 for offensive language detection. A custom convolutional neural network (CNN) architecture was built in Keras for supervised learning and trained on a dataset of YouTube comments written in code-mixed Tamil, in both Roman and Tamil scripts. The five-layer network was built using only Keras and required only simple tokenized input, padded to an appropriate length. Neither recurrent neural networks nor transfer learning were used, and the CNN model achieved an F-score of 0.835. en_US
dc.language.iso en en_US
dc.publisher CEUR-WS en_US
dc.subject Computer Science en_US
dc.subject Offensive language detection en_US
dc.subject Code-Mixed text en_US
dc.subject Tamil en_US
dc.subject HASOC en_US
dc.title Offensive Language Classification of Code-Mixed Tamil with Keras en_US
dc.type Article en_US
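
The abstract says the model needs only "simple tokenized data, padded to an appropriate length". The record gives no details of that preprocessing, so the following is only a minimal dependency-free sketch of what such a step might look like, not the authors' pipeline. The whitespace tokenizer, the reserved ids PAD_ID and OOV_ID, the function names, the sample comments, and the length of 6 are all assumptions made for illustration.

```python
from collections import Counter

PAD_ID = 0  # assumed id reserved for padding positions
OOV_ID = 1  # assumed id reserved for tokens not in the vocabulary


def build_vocab(texts, max_words=None):
    """Map each whitespace token to an integer id, most frequent first."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab):
    """Turn a comment into a list of ids; unseen tokens map to OOV_ID."""
    return [vocab.get(tok, OOV_ID) for tok in text.lower().split()]


def pad(ids, max_len):
    """Truncate or right-pad a sequence of ids to exactly max_len entries."""
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Invented sample comments mixing Roman and Tamil script (not from the dataset).
comments = ["padam vera level", "vera level mass padam ஆக இருக்கு"]
vocab = build_vocab(comments)
batch = [pad(encode(c, vocab), max_len=6) for c in comments]
```

Every row of `batch` has the same length, which is the fixed-size integer input an embedding layer followed by convolutional layers would typically expect. Whitespace splitting treats Roman-script and Tamil-script tokens identically, which is one plausible reading of "simple" tokenization for code-mixed text.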

