Monday, January 19, 2026 - 13:00
The School of Computer Science is pleased to present…
Evaluating Large Language Models to Support Software Engineering Tasks
MSc Thesis Defense by: Nafisha Binte Moin
Date: Monday, January 19, 2026
Time: 1:00 pm
Location: Lambton Tower Room 3105
Abstract: Large Language Models (LLMs) are increasingly used in software engineering (SE) tasks such as test generation and commit analysis, yet questions remain regarding the effectiveness of retrieval-augmented generation (RAG) techniques and the reliability of LLM outputs in practice. This research addresses these gaps through two complementary studies on automated unit test generation and bug-fix commit annotation. First, we study LLM-based unit test generation, evaluating four prompt strategies and integrating sparse (BM25, BM25L) and dense (SBERT-based FAISS, LSH, ANNOY, and HNSW) retrievers within a RAG framework. Results show that a few-shot instructional prompt achieves the highest correctness (99% pass rate) and branch coverage (72.56%). RAG further improves test robustness and diversity, with dense retrievers, particularly SBERT with HNSW, consistently outperforming sparse approaches. Compared with tests produced by Pynguin, LLM-generated tests are more often executable, better structured, and more semantically meaningful. Second, we evaluate LLMs as automated annotators for bug-fix commit identification across six GitHub repositories comprising over 23,000 commits. Experiments with GPT-4o and Claude 4.5 Sonnet under zero-shot, few-shot, and RAG configurations show that RAG-enhanced models achieve the best performance, with F1-scores between 0.70 and 0.85 and recall up to 0.95, substantially outperforming keyword-based heuristics. Overall, our results demonstrate that RAG-enhanced LLMs offer an effective and scalable solution for improving both unit test generation and bug-fix commit annotation in real-world SE settings. Datasets and implementations are publicly available.
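
For readers unfamiliar with the dense-retrieval side of such a RAG pipeline, the sketch below illustrates the general idea in Python: focal methods are embedded with an SBERT model, indexed with HNSW, and the nearest (method, test) examples are folded into a few-shot prompt. This is a minimal sketch, not the thesis's actual pipeline; the model name, the hnswlib library, the toy corpus, and the prompt template are all illustrative assumptions.

    # Minimal sketch of dense retrieval for RAG-based test generation.
    # Assumptions: SBERT model "all-MiniLM-L6-v2", the hnswlib library
    # for the HNSW index, and a toy corpus of (focal method, test) pairs.
    import hnswlib
    from sentence_transformers import SentenceTransformer

    # Toy retrieval corpus: focal methods paired with known-good unit tests.
    corpus = [
        ("def add(a, b): return a + b",
         "def test_add(): assert add(2, 3) == 5"),
        ("def is_even(n): return n % 2 == 0",
         "def test_is_even(): assert is_even(4) and not is_even(7)"),
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical SBERT choice
    embeddings = model.encode([src for src, _ in corpus],
                              normalize_embeddings=True)

    # Build an HNSW index over the focal-method embeddings (cosine space).
    index = hnswlib.Index(space="cosine", dim=embeddings.shape[1])
    index.init_index(max_elements=len(corpus), ef_construction=200, M=16)
    index.add_items(embeddings, list(range(len(corpus))))

    def build_few_shot_prompt(focal_method: str, k: int = 1) -> str:
        """Retrieve the k nearest examples and fold them into a prompt."""
        query = model.encode([focal_method], normalize_embeddings=True)
        labels, _ = index.knn_query(query, k=k)
        shots = "\n\n".join(
            f"# Example method:\n{corpus[i][0]}\n# Example test:\n{corpus[i][1]}"
            for i in labels[0]
        )
        return f"{shots}\n\n# Write a pytest unit test for:\n{focal_method}"

    print(build_few_shot_prompt("def double(x): return 2 * x"))

The resulting prompt would be sent to an LLM; sparse retrievers such as BM25 would replace the embedding-plus-index step with lexical scoring over the same corpus.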
Keywords: Large Language Models, Software Engineering, Unit Test Generation, Retrieval-Augmented Generation, Prompt Engineering
Thesis Committee:
Defense Chair: Dr. Ikjot Saini
Reader 1: Dr. Jessica Chen
Reader 2: Dr. Dan Wu
Advisor: Dr. Muhammad Asaduzzaman
