ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Wang, Yubo; Ma, Xueguang; Nie, Ping; Zeng, Huaye; Lyu, Zhiheng; Zhang, Yuxuan; Schneider, Benjamin; Lu, Yi; Yue, Xiang; Chen, Wenhu

arXiv:2504.00824v1 (cs)

[Submitted on 1 Apr 2025 (this version), latest version 3 Apr 2025 (v2)]


Abstract: Academic writing requires both coherent text generation and precise citation of relevant literature. Although recent Retrieval-Augmented Generation (RAG) systems have significantly improved factual accuracy in general-purpose text generation, their capacity to support professional academic writing remains limited. In this work, we introduce ScholarCopilot, a unified framework designed to enhance existing large language models for generating professional academic articles with accurate and contextually relevant citations. ScholarCopilot dynamically determines when to retrieve scholarly references by generating a retrieval token [RET], and then uses that token's representation to look up relevant citations from a database. The retrieved references are fed into the model to augment the generation process. We jointly optimize the generation and citation tasks within a single framework to increase efficiency. Trained on 500K papers from arXiv, our model achieves a top-1 retrieval accuracy of 40.1% on our evaluation dataset, outperforming baselines such as E5-Mistral-7B-Instruct (15.0%) and BM25 (9.8%). On a dataset of 1,000 academic writing samples, ScholarCopilot scores 16.2/25 in generation quality (measured across relevance, coherence, academic rigor, completeness, and innovation), surpassing models with 10x more parameters such as Qwen-2.5-72B-Instruct (15.8/25). Human studies further confirm ScholarCopilot's superior performance in citation recall, writing efficiency, and overall user experience.
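
The retrieval step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`cosine`, `retrieve`, `generate_with_citations`), the two-dimensional vectors, the cosine-similarity ranking, and the `[cite:...]` output format are all assumptions made for illustration. A real system would take the [RET] representation from the model's hidden states and feed the retrieved reference back into the model's context, which this sketch does not model.

```python
import math

RET = "[RET]"  # retrieval token named in the abstract


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, citation_db, k=1):
    """Return the k database entries most similar to the query vector."""
    ranked = sorted(
        citation_db,
        key=lambda entry: cosine(query_vec, entry["vec"]),
        reverse=True,
    )
    return ranked[:k]


def generate_with_citations(steps, citation_db):
    """Walk through (token, vector) pairs standing in for decoding steps.

    When the token is [RET], its vector is used as the retrieval query and
    the top-ranked citation key is written into the output instead.
    """
    output = []
    for token, vec in steps:
        if token == RET:
            best = retrieve(vec, citation_db, k=1)[0]
            output.append(f"[cite:{best['key']}]")
        else:
            output.append(token)
    return " ".join(output)


# Hypothetical two-entry citation database with hand-picked vectors.
citation_db = [
    {"key": "paperA", "vec": [1.0, 0.0]},
    {"key": "paperB", "vec": [0.0, 1.0]},
]

# Hypothetical decoding trace: three ordinary tokens, then a [RET] token.
steps = [
    ("RAG", None),
    ("improves", None),
    ("accuracy", None),
    (RET, [0.1, 0.9]),
]

text = generate_with_citations(steps, citation_db)
print(text)
```

In this toy trace the [RET] vector lies closest to the `paperB` entry, so that key is the one inserted at the citation position.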

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.00824 [cs.CL]
	(or arXiv:2504.00824v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.00824

Submission history

From: Yubo Wang
[v1] Tue, 1 Apr 2025 14:12:14 UTC (5,392 KB)
[v2] Thu, 3 Apr 2025 15:07:29 UTC (5,392 KB)
