Department of Computer Science and Information Systems
Permanent URI for this collection: http://localhost:4000/handle/123456789/1928
8 results
Search Results
Item: Character aware models with similarity learning for metaphor detection (Association for Computational Linguistics (ACL), 2020) Sharma, Yashvardhan
Recent work on automatic sequential metaphor detection has involved recurrent neural networks that are initialized with different pre-trained word embeddings and sometimes combined with hand-engineered features. To capture lexical and orthographic information automatically, in this paper we propose to add character-based word representations. Also, to contrast the difference between literal and contextual meaning, we utilize a similarity network. We explore these components via two different architectures, a BiLSTM model and a Transformer Encoder model similar to BERT, to perform metaphor identification. We participate in the Second Shared Task on Metaphor Detection on both the VUA and TOEFL datasets with the above models. The experimental results demonstrate the effectiveness of our method, as it outperforms all the systems that participated in the previous shared task.

Item: Ranking based Question Answering System with a Web and Mobile Application (IEEE, 2021) Sharma, Yashvardhan
This work examines the fundamental workings of the deep learning models proposed so far in order to arrive at an inventive and accurate ensemble model for handling questions that are either unanswerable or whose answers are present in the context. In many Natural Language Processing applications, Long Short-Term Memory and Gated Recurrent Unit networks have shown great performance. With the introduction of Convolutional Neural Networks (conventionally used for image analysis and object/landmark detection) into text analysis, along with the latest attention mechanisms, substantial progress in this domain was observed. On the SQuAD dataset, these models can be trained on all the possible varieties of questions that may exist.
An ensemble model based on the BERT encoder was also implemented for the Machine Reading Comprehension task. This work presents a ranking-based question answering system. The complete system is divided into three modules, namely Extension, Question Selection Web Service, and QA System. A Chrome extension and a mobile app were then developed to present this approach.

Item: Visual Question-Answering System Using Integrated Models of Image Captioning and BERT (Taylor & Francis, 2021) Sharma, Yashvardhan
Visual question answering (VQA) is a task that takes an image and a natural-language question about it as input and generates an answer to that question as output. This is a multidisciplinary problem: it includes problems faced in computer vision and natural language processing. This chapter combines a question answering network architecture (BERT) with image captioning models (BUTD, the show-and-tell model, CaptionBot, and the show, attend, and tell model) for VQA tasks. The chapter also compares these four VQA models.

Item: Applying TF-IDF and BERT-based Variants under Multilabel Classification for Emotion Detection in Urdu Language (CEUR-WS, 2022) Sharma, Yashvardhan
Nowadays, emojis are commonly used to show our emotions with a single image instead of long sentences describing them. Each emoji describes a particular emotion, such as anger, disgust, fear, sadness, surprise, or happiness. If we are given the task of identifying emotions in a text, we have to tag the text with multiple emojis, each pointing to a different emotion. This paper aims to check for multiple emotions in an Urdu text, which falls under the category of multi-label classification. We have used pre-trained BERT models to add basic knowledge about a language (Urdu in our case). On top of the pre-trained model, we added a classification layer using PyTorch.
The output layer has seven nodes, six of which are for the six emotions, and the seventh is for neutral. FIRE 2022 provided the Urdu tweet dataset used here as part of the subtask "Multi-label emotion classification in Urdu" of the main task "Emothreat: Emotion and Threat detection in Urdu."

Item: Context-Based Question Answering System with Suggested Questions (IEEE, 2022) Sharma, Yashvardhan
Question Answering and Question Generation are well-researched problems in the field of Natural Language Processing and Information Retrieval. This paper aims to demonstrate the use of novel transformer-based models like BERT, ALBERT, and DistilBERT for the Question Answering System and the T5 model for Question Generation. The Question Generation task is integrated with the Question Answering System to suggest relevant questions from the input context using the transfer learning-based model. The question generation model generates questions from the context input by the user, and different models like DistilBERT and RoBERTa are used to get answers from the context. Suggested questions are ranked using BM25 scores to show the most relevant question-answer pairs at the top.

Item: BITS Pilani at HinglishEval: Quality Evaluation for Code-Mixed Hinglish Text Using Transformers (Association for Computational Linguistics, 2022) Sharma, Yashvardhan
Code-Mixed text data consists of sentences having words or phrases from more than one language. Most multilingual communities worldwide communicate using multiple languages, with English usually one of them. Hinglish is a Code-Mixed text composed of Hindi and English but written in Roman script. This paper aims to determine the factors influencing the quality of Code-Mixed text data generated by the system.
For the HinglishEval task, the proposed model uses multilingual BERT to find the similarity between synthetically generated and human-generated sentences to predict the quality of synthetically generated Hinglish sentences.

Item: Rhetorical Role Labeling of Legal Documents using Transformers and Graph Neural Networks (2023-05) Sharma, Yashvardhan
A legal document is usually long and dense, requiring human effort to parse it. It also contains significant amounts of jargon, which makes existing models poorly suited to deriving insights from it. This paper presents the approaches undertaken to perform the task of rhetorical role labelling on Indian Court Judgements as part of SemEval Task 6: understanding legal texts, shared subtask A. We experiment with graph-based approaches like Graph Convolutional Networks and the Label Propagation Algorithm, and transformer-based approaches including variants of BERT, to improve accuracy scores on text classification of complex legal documents.

Item: Multilingual chatbot for Indian languages (IEEE, 2023) Sharma, Yashvardhan; Bhatia, Ashutosh; Tiwari, Kamlesh
Chatbots are user-friendly interfaces that emulate human dialogue. With the rise of technologies such as Artificial Intelligence (AI) and Natural Language Processing (NLP), chatbots have become an effective tool in most conversational applications of companies. India is a multilingual country, which demands making the chatbot functional in different languages. We present an effective approach to building a multilingual chatbot in Indian languages for fixed-response questions. This technique omits the expensive machine translation task, which has a large run-time overhead. We implement the chatbot's functionality to answer the query in context by fine-tuning the transformer model to the downstream task of question-answering. The MuRIL BERT model provides the best results for correct response prediction among major multilingual BERT models.
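
Illustrative note on the BM25 ranking mentioned in the "Context-Based Question Answering System with Suggested Questions" entry above: the abstract says only that suggested questions are ranked by BM25 score. The sketch below is one possible reading, not the authors' implementation. It assumes the user's context is the query and each generated question is a document. The function names, the regex tokenizer, and the parameter values k1=1.5 and b=0.75 are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; a deliberately simple, assumed tokenizer.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    if n_docs == 0:
        return []
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for d in docs for term in set(d))
    query_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            df = doc_freq[term]
            # Smoothed IDF that stays non-negative.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term frequency saturated by k1 and normalized by length via b.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def rank_questions(context, questions):
    """Sort (question, score) pairs from most to least relevant."""
    scores = bm25_scores(context, questions)
    return sorted(zip(questions, scores), key=lambda pair: pair[1], reverse=True)
```

Calling rank_questions(context, questions) with the user's context and a list of generated questions returns the questions paired with their scores, highest first. A question sharing no terms with the context scores 0.0.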