Offensive Language Classification of Code-Mixed Tamil with Keras

dc.contributor.author: Sharma, Yashvardhan
dc.date.accessioned: 2024-11-14T10:41:10Z
dc.date.available: 2024-11-14T10:41:10Z
dc.date.issued: 2021
dc.description.abstract: This paper presents the method adopted for Task 1 of the Dravidian-CodeMix-HASOC (Hate Speech and Offensive Content Identification in English and Indo-European Languages) shared task on offensive language detection, proposed by the Forum for Information Retrieval Evaluation (FIRE) in 2021. A custom convolutional neural network (CNN) was built in Keras for supervised learning and trained on a dataset of YouTube comments written in code-mixed Tamil, in both Roman and Tamil scripts. The five-layer network was built using only Keras and required only simple tokenized data, padded to an appropriate length. Neither recurrent neural networks nor transfer learning was used, and the CNN model achieved an F-score of 0.835. [en_US]
dc.identifier.uri: https://ceur-ws.org/Vol-3159/T3-14.pdf
dc.identifier.uri: https://dspace.bits-pilani.ac.in/handle/123456789/16381
dc.language.iso: en [en_US]
dc.publisher: CEUR-WS [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Offensive language detection [en_US]
dc.subject: Code-Mixed text [en_US]
dc.subject: Tamil [en_US]
dc.subject: HASOC [en_US]
dc.title: Offensive Language Classification of Code-Mixed Tamil with Keras [en_US]
dc.type: Article [en_US]
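
The abstract notes that the network needed only simple tokenized data, padded to an appropriate length. The record does not give the actual preprocessing, so the following is only a minimal sketch of that general idea in plain Python. The whitespace tokenizer, the vocabulary size, the reserved ids, right-padding, the sequence length of 5, and the toy comments are all assumptions made for illustration, not details from the paper. Keras provides its own padding utility (`pad_sequences`), which the authors may or may not have used.

```python
from collections import Counter

PAD_ID = 0  # assumed id reserved for padding
OOV_ID = 1  # assumed id for tokens missing from the vocabulary


def build_vocab(comments, max_words=10000):
    """Map the most frequent whitespace tokens to integer ids starting at 2."""
    counts = Counter(tok for text in comments for tok in text.lower().split())
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab):
    """Turn a comment into a list of ids; unseen tokens map to OOV_ID."""
    return [vocab.get(tok, OOV_ID) for tok in text.lower().split()]


def pad(seq, max_len):
    """Truncate or right-pad with PAD_ID so the sequence has length max_len."""
    return seq[:max_len] + [PAD_ID] * max(0, max_len - len(seq))


# Hypothetical toy comments mixing Roman and Tamil script;
# these are not taken from the shared-task dataset.
comments = ["padam super da", "padam romba mokka", "இந்த padam super"]
vocab = build_vocab(comments)
batch = [pad(encode(c, vocab), max_len=5) for c in comments]
```

Because the tokenizer only splits on whitespace, Roman-script and Tamil-script tokens share one vocabulary, and every row of `batch` has the same fixed length, which is the shape an embedding layer followed by convolutions would typically expect.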
