MIRROR: A Novel Approach for the Automated Evaluation of Open-Ended Question Generation

Deroy, Aniket; Maity, Subhankar; Sarkar, Sudeshna

arXiv:2410.12893 (cs)

[Submitted on 16 Oct 2024]

Abstract: Automatic question generation is a critical task that involves evaluating question quality by considering factors such as engagement, pedagogical value, and the ability to stimulate critical thinking. These aspects require human-like understanding and judgment, which automated systems currently lack. However, human evaluations are costly and impractical for large-scale samples of generated questions. Therefore, we propose a novel system, MIRROR (Multi-LLM Iterative Review and Response for Optimized Rating), which leverages large language models (LLMs) to automate the evaluation process for questions generated by automated question generation systems. We experimented with several state-of-the-art LLMs, such as GPT-4, Gemini, and Llama2-70b. We observed that the scores of human evaluation metrics, namely relevance, appropriateness, novelty, complexity, and grammaticality, improved when using the feedback-based approach called MIRROR, tending to be closer to the human baseline scores. Furthermore, we observed that Pearson's correlation coefficient between GPT-4 and human experts improved when using our proposed feedback-based approach, MIRROR, compared to direct prompting for evaluation. Error analysis shows that our proposed approach, MIRROR, significantly helps to improve relevance and appropriateness.
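The abstract compares evaluation approaches by Pearson's correlation coefficient between LLM ratings and human expert ratings. The following is a minimal sketch of how such a comparison might be computed. The function name `pearson` and every score in the example are hypothetical placeholders chosen for illustration; they are not data or code from the paper, and the paper's actual evaluation pipeline is not described on this page.

```python
import math


def pearson(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each list.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented placeholder ratings (NOT results from the paper): one score per
# generated question, for a single metric such as relevance.
human_scores = [4, 3, 5, 2, 4]
direct_prompt_scores = [3, 4, 4, 3, 2]
feedback_based_scores = [4, 3, 4, 2, 5]

r_direct = pearson(human_scores, direct_prompt_scores)
r_feedback = pearson(human_scores, feedback_based_scores)
print(f"direct prompting vs. human: r = {r_direct:.3f}")
print(f"feedback-based vs. human:   r = {r_feedback:.3f}")
```

A value of r closer to 1 indicates that an evaluator's ratings move more closely with the human ratings; the comparison reported in the abstract is of this kind, made between direct prompting and the feedback-based approach.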

Comments:	Accepted at FM-Eduassess @ NEURIPS 2024 (ORAL Paper)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2410.12893 [cs.CL]
	(or arXiv:2410.12893v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2410.12893

Submission history

From: Aniket Deroy
[v1] Wed, 16 Oct 2024 12:24:42 UTC (824 KB)
