BITS Faculty Publications
Permanent URI for this community: http://localhost:4000/handle/123456789/1867
8 results
Search Results
Item The role of generative AI tools in shaping mechanical engineering education from an undergraduate perspective (Springer Nature, 2025-03) Challa, Jagat Sesh; Kumar, Dhruv
This study evaluates the effectiveness of three leading generative AI tools (ChatGPT, Gemini, and Copilot) in undergraduate mechanical engineering education using a mixed-methods approach. The performance of these tools was assessed on 800 questions spanning seven core subjects, covering multiple-choice, numerical, and theory-based formats. While all three AI tools demonstrated strong performance in theory-based questions, they struggled with numerical problem-solving, particularly in areas requiring deep conceptual understanding and complex calculations. Among them, Copilot achieved the highest accuracy (60.38%), followed by Gemini (57.13%) and ChatGPT (46.63%). To complement these findings, a survey of 172 students and interviews with 20 participants provided insights into user experiences, challenges, and perceptions of AI in academic settings. Thematic analysis revealed concerns regarding AI’s reliability in numerical tasks and its potential impact on students’ problem-solving abilities. Based on these results, this study offers strategic recommendations for integrating AI into mechanical engineering curricula, ensuring its responsible use to enhance learning without fostering dependency. Additionally, we propose instructional strategies to help educators adapt assessment methods in the era of AI-assisted learning. These findings contribute to the broader discussion on AI’s role in engineering education and its implications for future learning methodologies.

Item Rubric is all you need: enhancing LLM-based code evaluation with question-specific rubrics (2025-03) Challa, Jagat Sesh; Kumar, Dhruv
Since the disruption in LLM technology brought about by the release of GPT-3 and ChatGPT, LLMs have shown remarkable promise in programming-related tasks. While code generation remains a popular field of research, code evaluation using LLMs remains a problem with no conclusive solution. In this paper, we focus on LLM-based code evaluation and attempt to fill in the existing gaps. We propose novel multi-agentic approaches using question-specific rubrics tailored to the problem statement, arguing that these perform better for logical assessment than the existing approaches that use question-agnostic rubrics. To address the lack of suitable evaluation datasets, we introduce two datasets: a Data Structures and Algorithms dataset containing 150 student submissions from a popular Data Structures and Algorithms practice website, and an Object Oriented Programming dataset comprising 80 student submissions from undergraduate computer science courses. In addition to using standard metrics (Spearman correlation, Cohen's kappa), we propose a new metric called Leniency, which quantifies evaluation strictness relative to expert assessment. Our comprehensive analysis demonstrates that question-specific rubrics significantly enhance logical assessment of code in educational settings, providing better feedback aligned with instructional goals beyond mere syntactic correctness.

Item From information overload to lucidity: a survey on leveraging GPTs for systematic summarization of medical and biomedical artifacts (IEEE, 2024-12) Chalapathi, G.S.S.; Singh, Amit Rajnarayan
In medical research, the rapid proliferation of condition-specific studies has led to an information overload, making it challenging for researchers and practitioners to stay abreast of the latest findings. This paper presents a comprehensive survey on leveraging Generative Pretrained Transformers (GPTs) to summarize medical and biomedical artifacts systematically. We delve into the current applications of GPTs in this domain, discussing their role in understanding and summarizing research papers, medical dialogues, and medical records. Through a comparative analysis of recent studies and methodologies, we highlight the effectiveness of GPTs in distilling complex medical information into concise, understandable summaries. Our survey underscores the potential of GPTs as a tool for navigating the information overload in medical research and bringing clarity to healthcare professionals. This transformation will enhance patient care and outcomes by improving the accessibility and comprehensibility of medical research, assisting in rapid information retrieval, and facilitating the summarization of complex medical studies for broader audiences.

Item Generative AI for Transformative Healthcare: A Comprehensive Study of Emerging Models, Applications, Case Studies, and Limitations (IEEE, 2024-02) Chamola, Vinay
Generative artificial intelligence (GAI) can be broadly described as an artificial intelligence system capable of generating images, text, and other media types from human prompts. GAI models like ChatGPT, DALL-E, and Bard have recently caught the attention of industry and academia equally. GAI applications span various industries like art, gaming, fashion, and healthcare. In healthcare, GAI shows promise in medical research, diagnosis, treatment, and patient care and is already making strides in real-world deployments. There has yet to be any detailed study concerning the applications and scope of GAI in healthcare. Addressing this research gap, we explore several applications, real-world scenarios, and limitations of GAI in healthcare. We examine how GAI models like ChatGPT and DALL-E can be leveraged to aid in the applications of medical imaging, drug discovery, personalized patient treatment, medical simulation and training, clinical trial optimization, mental health support, healthcare operations and research, medical chatbots, human movement simulation, and a few more applications. Along with applications, we cover four real-world healthcare scenarios that employ GAI: visual snow syndrome diagnosis, molecular drug optimization, medical education, and dentistry. We also provide an elaborate discussion on seven healthcare-customized LLMs such as Med-PaLM, BioGPT, and DeepHealth. Since GAI is still evolving, it poses challenges such as the lack of professional expertise in decision making, risks to patient data privacy, issues in integrating with existing healthcare systems, and the problem of data bias, which are elaborated on in this work along with several other challenges. We also put forward multiple directions for future research in GAI for healthcare.

Item "With Great Power Comes Great Responsibility!": Student and Instructor Perspectives on the influence of LLMs on Undergraduate Engineering Education (2023-09) Kumar, Dhruv
The rise in popularity of Large Language Models (LLMs) has prompted discussions in academic circles, with students exploring LLM-based tools for coursework inquiries and instructors exploring them for teaching and research. Even though a lot of work is underway to create LLM-based tools tailored for students and instructors, there is a lack of comprehensive user studies that capture the perspectives of students and instructors regarding LLMs. This paper addresses this gap by conducting surveys and interviews within undergraduate engineering universities in India. Using 1306 survey responses from students, 112 student interviews, and 27 instructor interviews around the academic usage of ChatGPT (a popular LLM), this paper offers insights into the current usage patterns, perceived benefits, threats, and challenges, as well as recommendations for enhancing the adoption of LLMs among students and instructors. These insights are further utilized to discuss the practical implications of LLMs in undergraduate engineering education and beyond.

Item “It's not like Jarvis, but it's pretty close!” - Examining ChatGPT's Usage among Undergraduate Students in Computer Science (ACM Digital Library, 2024-01) Kumar, Dhruv
Large language models (LLMs) such as ChatGPT and Google Bard have garnered significant attention in the academic community. Previous research has evaluated these LLMs for various applications such as generating programming exercises and solutions. However, these evaluations have predominantly been conducted by instructors and researchers, not considering the actual usage of LLMs by students. This study adopts a student-first approach to comprehensively understand how undergraduate computer science students utilize ChatGPT, a popular LLM released by OpenAI. We employ a combination of student surveys and interviews to obtain valuable insights into the benefits, challenges, and suggested improvements related to ChatGPT. Our findings suggest that a majority of students (over 57%) have a convincingly positive outlook towards adopting ChatGPT as an aid in coursework-related tasks. However, our research also highlights various challenges that must be resolved for long-term acceptance of ChatGPT amongst students. The findings from this investigation have broader implications and may be applicable to other LLMs and their role in computing education.

Item ChatGPT in the Classroom: An Analysis of Its Strengths and Weaknesses for Solving Undergraduate Computer Science Questions (ACM Digital Library, 2024) Kumar, Dhruv
This research paper aims to analyze the strengths and weaknesses associated with the utilization of ChatGPT as an educational tool in the context of undergraduate computer science education. ChatGPT's usage in tasks such as solving assignments and exams has the potential to undermine students' learning outcomes and compromise academic integrity. This study adopts a quantitative approach to demonstrate the notable unreliability of ChatGPT in providing accurate answers to a wide range of questions within the field of undergraduate computer science. While the majority of existing research has concentrated on assessing the performance of Large Language Models in handling programming assignments, our study adopts a more comprehensive approach. Specifically, we evaluate various types of questions such as true/false, multi-choice, multi-select, short answer, long answer, design-based, and coding-related questions. Our evaluation highlights the potential consequences of students excessively relying on ChatGPT for the completion of assignments and exams, including self-sabotage. We conclude with a discussion on how students and instructors can constructively use ChatGPT and related tools to enhance the quality of instruction and the overall student experience.

Item Can ChatGPT Play the Role of a Teaching Assistant in an Introductory Programming Course? (2024-01) Kumar, Dhruv
The emergence of large language models (LLMs) is expected to have a major impact on education. This paper explores the potential of using ChatGPT, an LLM, as a virtual Teaching Assistant (TA) in an introductory programming course. We evaluate ChatGPT's capabilities by comparing its performance with that of human TAs in some of the important TA functions. The TA functions we focus on are (1) grading student code submissions, and (2) providing feedback to undergraduate students in an introductory programming course. Firstly, we assess ChatGPT's proficiency in grading student code submissions using a given grading rubric and compare its performance with the grades assigned by human TAs. Secondly, we analyze the quality and relevance of the feedback provided by ChatGPT. This evaluation considers how well ChatGPT addresses mistakes and offers suggestions for improvement in student solutions from both code correctness and code quality perspectives. We conclude with a discussion on the implications of integrating ChatGPT into computing education for automated grading, personalized learning experiences, and instructional support.