Decoding the End-to-end Writing Trajectory in Scholarly Manuscripts

Koo, Ryan; Martin, Anna; Wang, Linghe; Kang, Dongyeop

Abstract:Scholarly writing presents a complex space that generally follows a methodical procedure to plan and produce both rationally sound and creative compositions. Recent works involving large language models (LLM) demonstrate considerable success in text generation and revision tasks; however, LLMs still struggle to provide structural and creative feedback on the document level that is crucial to academic writing. In this paper, we introduce a novel taxonomy that categorizes scholarly writing behaviors according to intention, writer actions, and the information types of the written data. We also provide ManuScript, an original dataset annotated with a simplified version of our taxonomy to show writer actions and the intentions behind them. Motivated by cognitive writing theory, our taxonomy for scientific papers includes three levels of categorization in order to trace the general writing flow and identify the distinct writer activities embedded within each higher-level process. ManuScript intends to provide a complete picture of the scholarly writing process by capturing the linearity and non-linearity of writing trajectory, such that writing assistants can provide stronger feedback and suggestions on an end-to-end level. The collected writing trajectories are viewed at this https URL

Subjects:	Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2304.00121 [cs.CL]
	(or arXiv:2304.00121v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2304.00121

Computer Science > Computation and Language

Title:Decoding the End-to-end Writing Trajectory in Scholarly Manuscripts

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators