On the effectiveness of LLMs for automatic grading of open-ended questions in Spanish

Capdehourat, Germán; Amigo, Isabel; Lorenzo, Brian; Trigo, Joaquín

arXiv:2503.18072 (cs)

[Submitted on 23 Mar 2025]

Abstract: Grading is a time-consuming and laborious task that educators must face. It is an important task since it provides feedback signals to learners, and it has been demonstrated that timely feedback improves the learning process. In recent years, the rise of LLMs has shed light on the effectiveness of automatic grading. In this paper, we explore the performance of different LLMs and prompting techniques in automatically grading short-text answers to open-ended questions. Unlike most of the literature, our study focuses on a use case where the questions, answers, and prompts are all in Spanish. Experimental results comparing automatic scores to those of human-expert evaluators show good outcomes in terms of accuracy, precision and consistency for advanced LLMs, both open and proprietary. Results are notably sensitive to prompt styles, suggesting biases toward certain words or content in the prompt. However, the best combinations of models and prompt strategies consistently surpass an accuracy of 95% in a three-level grading task, rising to more than 98% when the task is simplified to a binary right-or-wrong rating problem. This demonstrates the potential of LLMs to implement this type of automation in education applications.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2503.18072 [cs.CL]
	(or arXiv:2503.18072v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2503.18072

Submission history

From: German Capdehourat
[v1] Sun, 23 Mar 2025 13:43:27 UTC (1,646 KB)
