A Comparative Analysis of Large Language Models for Code Documentation Generation

dc.contributor.author: Kumar, Dhruv
dc.date.accessioned: 2024-08-12T10:47:23Z
dc.date.available: 2024-08-12T10:47:23Z
dc.date.issued: 2023-12
dc.description.abstract: This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for code documentation generation. Code documentation is an essential part of the software writing process. The paper evaluates models such as GPT-3.5, GPT-4, Bard, Llama 2, and StarChat on parameters such as Accuracy, Completeness, Relevance, Understandability, Readability, and Time Taken for different levels of code documentation. Our evaluation employs a checklist-based system to minimize subjectivity, providing a more objective assessment. We find that, barring StarChat, all LLMs consistently outperform the original documentation. Notably, the closed-source models GPT-3.5, GPT-4, and Bard exhibit superior performance across various parameters compared to the open-source/source-available LLMs, namely Llama 2 and StarChat. In terms of generation time, GPT-4 took the longest, followed by Llama 2 and then Bard, with ChatGPT and StarChat having comparable generation times. Additionally, file-level documentation performed considerably worse across all parameters (except time taken) than inline and function-level documentation. [en_US]
dc.identifier.uri: https://arxiv.org/abs/2312.10349
dc.identifier.uri: http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/15212
dc.language.iso: en [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Large Language Models (LLMs) [en_US]
dc.subject: GPT-3.5 [en_US]
dc.title: A Comparative Analysis of Large Language Models for Code Documentation Generation [en_US]
dc.type: Preprint [en_US]
