Applying TF-IDF and BERT-based Variants under Multilabel Classification for Emotion Detection in Urdu Language
Date
2022
Authors
Publisher
CEUR-WS
Abstract
Emojis are now commonly used to express an emotion with a single image instead of a long
sentence. Each emoji conveys a particular emotion, such as anger, disgust, fear, sadness,
surprise, or happiness. Identifying the emotions in a text is therefore comparable to tagging it
with several emojis, each pointing to a different emotion. This paper detects multiple emotions
in Urdu text, a task that falls under multi-label classification. We use pre-trained BERT models
to supply basic knowledge of the language (Urdu in our case) and add a classification layer on
top of the pre-trained model using PyTorch. The output layer has seven nodes: six for the six
emotions and a seventh for neutral. The Urdu tweet dataset used here was provided by FIRE 2022
as part of the subtask "Multi-label emotion classification in Urdu" of the main task
"Emothreat: Emotion and Threat detection in Urdu."
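
The seven-node output described above can be read as seven independent yes/no decisions, which is what separates multi-label from single-label classification. The following is a minimal sketch of that decoding step in plain Python, not the authors' implementation. The label order, the 0.5 threshold, and the example scores are assumptions made for illustration; the abstract specifies none of them. In a PyTorch model the raw scores would come from the classification layer placed over the BERT encoder.

```python
import math

# Label order is an assumption for illustration; the abstract names six
# emotions plus neutral but does not specify which node maps to which label.
LABELS = ["anger", "disgust", "fear", "sadness", "surprise", "happiness", "neutral"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits: list[float], threshold: float = 0.5) -> list[str]:
    """Return every label whose sigmoid probability reaches the threshold.

    Each node is judged on its own, so zero, one, or several labels can be
    returned for the same text. The 0.5 threshold is an assumed default.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical raw scores for one tweet: only the first and fourth are positive.
example_logits = [2.1, -1.3, -0.7, 0.9, -2.4, -3.0, -1.8]
predicted = decode(example_logits)
print(predicted)  # ['anger', 'sadness']
```

Applying a sigmoid to each node, rather than a softmax across all seven, is what allows a single tweet to carry more than one emotion at once.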
Keywords
Computer Science, Social media, UrduHack, BERT, Multi-label classification, Negative weight, Positive weight, Transformers model, Text classification