DSpace Repository

Visual Question-Answering System Using Integrated Models of Image Captioning and BERT


dc.contributor.author Goel, Lavika
dc.contributor.author Dhawan, Mohit
dc.contributor.author Rathore, Rachit
dc.contributor.author Rai, Satyansh
dc.contributor.author Kapoor, Aaryan
dc.contributor.author Sharma, Yashvardhan
dc.date.accessioned 2024-11-14T09:22:46Z
dc.date.available 2024-11-14T09:22:46Z
dc.date.issued 2021
dc.identifier.uri https://www.taylorfrancis.com/chapters/edit/10.1201/9781003102380-9/visual-question-answering-system-using-integrated-models-image-captioning-bert-lavika-goel-mohit-dhawan-rachit-rathore-satyansh-rai-aaryan-kapoor-yashvardhan-sharma
dc.identifier.uri http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/16373
dc.description.abstract Visual question answering (VQA) is a task that takes an image and a natural-language question about that image as input and generates an answer to the question as output. It is a multidisciplinary problem, drawing on challenges from both computer vision and natural language processing. This chapter combines a question-answering network (BERT) with four image-captioning models (BUTD; Show and Tell; CaptionBot; and Show, Attend and Tell) to build VQA systems, and compares the four resulting VQA models. en_US
dc.language.iso en en_US
dc.publisher Taylor & Francis en_US
dc.subject Computer Science en_US
dc.subject Visual question answering (VQA) en_US
dc.subject BERT en_US
dc.subject BUTD en_US
dc.title Visual Question-Answering System Using Integrated Models of Image Captioning and BERT en_US
dc.type Book chapter en_US
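The integrated pipeline the abstract describes, captioning the image first and then answering the question from the caption text with a BERT-style QA model, can be sketched as below. This is a minimal illustration only: the captioner and the answer-selection heuristic are toy stand-ins for the real BUTD/Show-and-Tell and BERT components, and all function names and example data are assumptions, not taken from the chapter.

```python
# Sketch of the two-stage VQA pipeline: stage 1 maps the image to a
# caption, stage 2 answers the question from that caption text.
# Both stages below are toy stand-ins for the real neural models.

def caption_image(image_id: str) -> str:
    """Stand-in for an image-captioning model (BUTD, Show and Tell, ...)."""
    captions = {
        "img_001": "a brown dog is playing with a red ball on the grass",
    }
    return captions[image_id]

def answer_from_caption(caption: str, question: str) -> str:
    """Stand-in for BERT extractive QA: a real model would predict an
    answer span; this heuristic handles only 'what color' questions."""
    colors = {"red", "brown", "green", "blue"}
    q = question.lower()
    if "color" in q:
        words = caption.split()
        # return a color word whose following noun is mentioned in the question
        for i, w in enumerate(words):
            if w in colors and i + 1 < len(words) and words[i + 1] in q:
                return w
    return caption  # fall back to the full caption

def vqa(image_id: str, question: str) -> str:
    caption = caption_image(image_id)              # stage 1: vision -> text
    return answer_from_caption(caption, question)  # stage 2: text-only QA

print(vqa("img_001", "What color is the ball?"))  # -> red
```

Swapping different captioning models into stage 1 while holding the QA stage fixed is what yields the four VQA variants the chapter compares.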


Files in this item


There are no files associated with this item.

