Please use this identifier to cite or link to this item:
http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/20618

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Ray, Saumi | - |
| dc.date.accessioned | 2026-01-22T11:07:35Z | - |
| dc.date.available | 2026-01-22T11:07:35Z | - |
| dc.date.issued | 2025-12 | - |
| dc.identifier.uri | https://arxiv.org/abs/2512.15312 | - |
| dc.identifier.uri | http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/20618 | - |
| dc.description.abstract | Extracting structured information from zeolite synthesis experimental procedures is critical for materials discovery, yet existing methods have not systematically evaluated Large Language Models (LLMs) for this domain-specific task. This work addresses a fundamental question: what is the efficacy of different prompting strategies when applying LLMs to scientific information extraction? We focus on four key subtasks: event type classification (identifying synthesis steps), trigger text identification (locating event mentions), argument role extraction (recognizing parameter types), and argument text extraction (extracting parameter values). We evaluate four prompting strategies (zero-shot, few-shot, event-specific, and reflection-based) across six state-of-the-art LLMs (Gemma-3-12b-it, GPT-5-mini, O4-mini, Claude-Haiku-3.5, DeepSeek reasoning and non-reasoning) using the ZSEE dataset of 1,530 annotated sentences. Results demonstrate strong performance on event type classification (80-90% F1) but modest performance on fine-grained extraction tasks, particularly argument role and argument text extraction (50-65% F1). GPT-5-mini exhibits extreme prompt sensitivity, with F1 varying from 11% to 79%. Notably, advanced prompting strategies provide minimal improvements over zero-shot approaches, revealing fundamental architectural limitations. Error analysis identifies systematic hallucination, over-generalization, and an inability to capture synthesis-specific nuances. Our findings demonstrate that while LLMs achieve high-level understanding, precise extraction of experimental parameters requires domain-adapted models, providing quantitative benchmarks for scientific information extraction. | en_US |
| dc.language.iso | en | en_US |
| dc.subject | Chemistry | en_US |
| dc.subject | Zeolite synthesis extraction | en_US |
| dc.subject | Large language models (LLMs) | en_US |
| dc.subject | Prompting strategies | en_US |
| dc.subject | Scientific information extraction | en_US |
| dc.title | Evaluating LLMs for zeolite synthesis event extraction (ZSEE): a systematic analysis of prompting strategies | en_US |
| dc.type | Preprint | en_US |
| Appears in Collections: | Department of Chemistry | |
Files in This Item: There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.