SyllabusQA: A Course Logistics Question Answering Dataset

Fernandez, Nigel; Scarlatos, Alexander; Lan, Andrew

arXiv:2403.14666 (cs)

[Submitted on 3 Mar 2024 (v1), last revised 22 Jul 2024 (this version, v2)]

Abstract: Automated teaching assistants and chatbots have significant potential to reduce the workload of human instructors, especially for logistics-related question answering, which is important to students yet repetitive for instructors. However, due to privacy concerns, there is a lack of publicly available datasets. We introduce SyllabusQA, an open-source dataset with 63 real course syllabi covering 36 majors, containing 5,078 open-ended course logistics-related question-answer pairs that are diverse in both question types and answer formats. Since many logistics-related questions contain critical information like the date of an exam, it is important to evaluate the factuality of answers. We benchmark several strong baselines on this task, from large language model prompting to retrieval-augmented generation. We introduce Fact-QA, an LLM-based (GPT-4) evaluation metric to evaluate the factuality of predicted answers. We find that despite performing close to humans on traditional metrics of textual similarity, there remains a significant gap between automated approaches and humans in terms of fact precision.

Comments:	ACL 2024: The 62nd Annual Meeting of the Association for Computational Linguistics
Subjects:	Computers and Society (cs.CY); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as:	arXiv:2403.14666 [cs.CY]
	(or arXiv:2403.14666v2 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2403.14666

Submission history

From: Nigel Fernandez
[v1] Sun, 3 Mar 2024 03:01:14 UTC (531 KB)
[v2] Mon, 22 Jul 2024 20:37:55 UTC (532 KB)
