Leveraging RLHF with Instruction Fine-tuning for Improving LLM Response in Bangla Conversations

dc.contributor.authorRahman, Tahmid
dc.contributor.authorMahmud, Shahriar
dc.contributor.authorNasrum, Nur
dc.date.accessioned2025-06-03T04:57:36Z
dc.date.available2025-06-03T04:57:36Z
dc.date.issued2024-11-30
dc.descriptionSupervised by Dr. Hasan Mahmud, Associate Professor, and Dr. Md. Kamrul Hasan, Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.en_US
dc.description.abstractIn the realm of Bangla conversational agents, this research seeks to improve the responsiveness of Large Language Models (LLMs) through the combined application of Reinforcement Learning from Human Feedback (RLHF) and instruction fine-tuning. The primary objectives are the creation of a domain-specific Bangla conversational dataset, an evaluation of existing LLMs using an instruction-tuned dataset, and the introduction of a novel human-centric benchmarking framework. The work follows a multi-step process, each step contributing to the refinement and optimization of model performance. To address the shortage of domain-specific datasets for Bangla conversational agents, we begin by constructing a Bangla conversational dataset. The dataset is then restructured into an instruction-tuned format; this structuring makes the data more suitable for training language models, allowing them to better comprehend and respond to precise instructions in the Bangla conversational setting. Existing LLMs then undergo Supervised Fine-Tuning (SFT) on the instruction-tuned dataset. This fine-tuning procedure ensures that the models are adapted to the variety and complexity of Bangla conversations, maximizing their performance in accordance with the dataset's specific instructions. Following fine-tuning, we conduct a detailed evaluation and comparison of the LLMs, which provides insight into the effectiveness of the fine-tuned models and enables selection of the most promising candidate. Finally, RLHF is applied as an iterative procedure that refines the selected model with human feedback, improving its performance in a more dynamic and nuanced way.en_US
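The abstract's instruction-tuned restructuring step can be sketched as follows. This is an illustrative example only, not code from the thesis: the Alpaca-style prompt template and the field names (`prompt`, `completion`) are assumptions about what such an instruction-tuned record commonly looks like for SFT.

```python
# Illustrative sketch (assumed format, not from the thesis): wrapping one
# Bangla conversational turn in an Alpaca-style instruction template of the
# kind commonly used to prepare data for Supervised Fine-Tuning (SFT).

def to_instruction_record(instruction: str, user_input: str, response: str) -> dict:
    """Build a single instruction-tuned training record from one turn."""
    prompt = (
        "### Instruction:\n" + instruction + "\n\n"
        "### Input:\n" + user_input + "\n\n"
        "### Response:\n"
    )
    # The model is trained to produce `completion` when shown `prompt`.
    return {"prompt": prompt, "completion": response}

record = to_instruction_record(
    "Answer the user's question politely in Bangla.",
    "আপনি কেমন আছেন?",          # "How are you?"
    "আমি ভালো আছি, ধন্যবাদ।",   # "I am well, thank you."
)
```

Records of this shape can then be serialized (e.g. as JSON Lines) and fed to an SFT pipeline; the exact template used in the thesis may differ.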
dc.identifier.citation[1] J. M. Liu, D. Li, H. Cao, T. Ren, Z. Liao, and J. Wu, “ChatCounselor: A large language models for mental health support,” arXiv preprint arXiv:2309.15461, 2023. [2] D. Bill and T. Eriksson, “Fine-tuning an LLM using reinforcement learning from human feedback for a therapy chatbot application,” 2023. [3] S. Zhang, L. Dong, X. Li, S. Zhang, X. Sun, S. Wang, J. Li, R. Hu, T. Zhang, F. Wu et al., “Instruction tuning for large language models: A survey,” arXiv preprint arXiv:2308.10792, 2023.en_US
dc.identifier.urihttp://hdl.handle.net/123456789/2414
dc.language.isoenen_US
dc.publisherDepartment of Computer Science and Engineering(CSE), Islamic University of Technology(IUT), Board Bazar, Gazipur-1704, Bangladeshen_US
dc.subjectReinforcement Learning from Human Feedback (RLHF), Instruction fine-tuning, Human-centric benchmarking, Supervised Fine-Tuning (SFT)en_US
dc.titleLeveraging RLHF with Instruction Fine-tuning for Improving LLM Response in Bangla Conversationsen_US
dc.typeThesisen_US

Files

Original bundle

Name:
Fulltext_ CSE_190041237_190041118_190041106_Book - Shahriar Mahmud 190041118.pdf
Size:
2.56 MB
Format:
Adobe Portable Document Format
Description:
Name:
Signature Shhet_ CSE_190041237_190041118_190041106_Signatures - Shahriar Mahmud 190041118.pdf
Size:
2.9 MB
Format:
Adobe Portable Document Format
Description:
Name:
Plagarism Report_ CSE_190041237_190041118_190041106_- Shahriar Mahmud 190041118 (1).pdf
Size:
1.82 MB
Format:
Adobe Portable Document Format
Description:

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description:

Collections