
Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/18959
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chamola, Vinay | -
dc.contributor.author | Gupta, Karunesh Kumar | -
dc.date.accessioned | 2025-05-20T09:11:07Z | -
dc.date.available | 2025-05-20T09:11:07Z | -
dc.date.issued | 2025-04 | -
dc.identifier.uri | https://ieeexplore.ieee.org/abstract/document/10964149 | -
dc.identifier.uri | http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/18959 | -
dc.description.abstract | BLEURT is a recently introduced metric that employs BERT, a potent pre-trained language model, to assess how well candidate translations compare to a reference translation in the context of machine translation outputs. While traditional metrics like BLEU rely on lexical similarities, BLEURT leverages BERT's semantic and syntactic capabilities to provide more robust evaluation through complex text representations. However, studies have shown that BERT, despite its impressive performance in natural language processing tasks, can sometimes deviate from human judgment, particularly in specific syntactic and semantic scenarios. Through systematic experimental analysis at the word level, including categorization of errors such as lexical mismatches, untranslated terms, and structural inconsistencies, we investigate how BLEURT handles various translation challenges. Our study addresses three central questions: What are the strengths and weaknesses of BLEURT, how do they align with BERT's known limitations, and how does BLEURT compare with BERTScore, a similar automatic neural metric for machine translation? Using manually annotated datasets that emphasize different error types and linguistic phenomena, we find that BLEURT excels at identifying nuanced differences between sentences with high overlap, an area where BERTScore shows limitations. Our systematic experiments provide insights for the effective application of both metrics in machine translation evaluation. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.subject | EEE | en_US
dc.subject | Natural Language Processing (NLP) | en_US
dc.subject | Deep learning | en_US
dc.subject | Machine learning (ML) | en_US
dc.subject | Metrics | en_US
dc.title | A detailed comparative analysis of automatic neural metrics for machine translation: bleurt & bertscore | en_US
dc.type | Article | en_US
Appears in Collections: Department of Electrical and Electronics Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.