BITS-P at WAT 2023: Improving Indic Language Multimodal Translation by Image Augmentation using Diffusion Models

dc.contributor.author: Sharma, Yashvardhan
dc.date.accessioned: 2024-11-12T08:29:25Z
dc.date.available: 2024-11-12T08:29:25Z
dc.date.issued: 2023
dc.description.abstract: This paper describes our system for multimodal machine translation. We participated in the multimodal translation tasks from English into three Indic languages: Hindi, Bengali, and Malayalam. We leverage the inherent richness of multimodal data to resolve ambiguity in translation. We fine-tuned the ‘No Language Left Behind’ (NLLB) machine translation model for multimodal translation and further improved its accuracy through image data augmentation using latent diffusion. Our submission achieves the best BLEU score for the English-Hindi, English-Bengali, and English-Malayalam language pairs on both the Evaluation and Challenge test sets. [en_US]
dc.identifier.uri: https://aclanthology.org/2023.wat-1.3/
dc.identifier.uri: https://dspace.bits-pilani.ac.in/handle/123456789/16342
dc.language.iso: en [en_US]
dc.publisher: Association for Computational Linguistics [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Machine Translation (MT) [en_US]
dc.subject: Multimodal Machine Translation (MMT) [en_US]
dc.subject: Image Augmentation [en_US]
dc.title: BITS-P at WAT 2023: Improving Indic Language Multimodal Translation by Image Augmentation using Diffusion Models [en_US]
dc.type: Article [en_US]
