Unlocking the Potential of Multiple BERT Models for Bangla Question Answering in NCTB Textbooks

Khondoker, Abdullah; Taufik, Enam Ahmed; Tashik, Md Iftekhar Islam; Mahmud, S M Ishtiak; Parsa, Antara Firoz

Abstract: Evaluating text comprehension in educational settings is critical for understanding student performance and improving curricular effectiveness. This study investigates how well three pretrained language models (RoBERTa Base, Bangla-BERT, and BERT Base) can automatically assess Bangla passage-based question answering drawn from the National Curriculum and Textbook Board (NCTB) textbooks for classes 6-10. A dataset of approximately 3,000 Bangla passage-based question-answering instances was compiled, and the models were evaluated using the F1 score and Exact Match (EM) metrics across various hyperparameter configurations. Bangla-BERT consistently outperformed the other models, achieving the highest F1 (0.75) and EM (0.53) scores, particularly with smaller batch sizes, the inclusion of stop words, and a moderate learning rate. In contrast, RoBERTa Base showed the weakest performance, with the lowest F1 (0.19) and EM (0.27) scores under certain configurations. The results underscore the importance of hyperparameter tuning for model performance and highlight the potential of machine learning models for evaluating text comprehension in educational contexts. However, limitations such as dataset size, spelling inconsistencies, and computational constraints point to the need for further research to improve the robustness and applicability of these models. This study lays the groundwork for future automated evaluation systems in educational institutions and offers insight into model performance on Bangla text comprehension.
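For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how Exact Match and token-level F1 are commonly computed for extractive question answering (in the style of SQuAD evaluation). It is an illustration only: the abstract does not specify the authors' tokenization or answer normalization, so the whitespace tokenizer and the function names `tokenize`, `exact_match`, and `token_f1` are assumptions introduced here.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: plain whitespace tokenization. The paper's actual
    # normalization (punctuation, stop words, spelling variants) is not
    # described in the abstract and may differ.
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 if the token sequences are identical, otherwise 0.0.
    return float(tokenize(prediction) == tokenize(reference))


def token_f1(prediction: str, reference: str) -> float:
    # Harmonic mean of token-level precision and recall.
    pred_tokens = tokenize(prediction)
    ref_tokens = tokenize(reference)
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example answers, not taken from the paper's dataset.
    reference = "ঢাকা বাংলাদেশের রাজধানী"
    prediction = "ঢাকা"
    print(exact_match(prediction, reference))
    print(token_f1(prediction, reference))
```

A corpus-level score under this sketch would be the mean of the per-question values. EM rewards only complete agreement, while F1 gives partial credit for overlapping tokens.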

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2412.18440 [cs.CL]
	(or arXiv:2412.18440v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.18440
