AdQuestA: knowledge-guided visual question answer framework for advertisements

Date

2025

Publisher

IEEE

Abstract

In the rapidly evolving landscape of digital marketing, effective customer engagement through advertisements is crucial for brands. Computational understanding of ads is therefore pivotal for recommendation, authoring, and customer behaviour simulation. Despite advances in knowledge-guided visual question answering (VQA) models, existing frameworks often fail to produce domain-specific responses, and benchmark datasets for advertisements are scarce. To address this gap, we introduce ADVQA, the first dataset for ad-related VQA, sourced from Facebook and X (formerly Twitter), to facilitate further research in ad comprehension. It comprises open-ended questions and detailed context obtained automatically from web articles. Moreover, we present AdQuestA, a novel multimodal framework for knowledge-guided open-ended question answering tailored to advertisements. AdQuestA leverages Retrieval-Augmented Generation (RAG) to obtain question-aware ad context as explicit knowledge, together with image-grounded implicit knowledge, and exploits the inherent relationships between them for reasoning. Extensive experiments corroborate its efficacy, yielding state-of-the-art performance on the ADVQA dataset and even surpassing 10× larger models such as GPT-4 on this task. Our framework not only enhances understanding of ad content but also advances the broader landscape of knowledge-guided VQA models.

Keywords

Computer Science, Visual question answering for advertisements (AdVQA), Digital marketing, Customer engagement
