InkFM: A Foundational Model for Full-Page Online Handwritten Note Understanding

Fadeeva, Anastasiia; Coriou, Vincent; Antognini, Diego; Musat, Claudiu; Maksai, Andrii

arXiv:2503.23081 (cs)

[Submitted on 29 Mar 2025]

Abstract: Tablets and styluses are increasingly popular for taking notes. To optimize this experience and ensure a smooth and efficient workflow, it is important to develop methods for accurately interpreting and understanding the content of handwritten digital notes. We introduce InkFM, a foundational model for analyzing full pages of handwritten content. Trained on a diverse mixture of tasks, the model offers a unique combination of capabilities: recognizing text in 28 different scripts, recognizing mathematical expressions, and segmenting pages into distinct elements such as text and drawings. Our results demonstrate that these tasks can be effectively unified within a single model, which achieves state-of-the-art (SoTA) out-of-the-box text line segmentation quality, surpassing public baselines such as docTR. Fine-tuning or LoRA-tuning our base model on public datasets further improves page segmentation quality and achieves state-of-the-art text recognition (on the DeepWriting, CASIA, SCUT, and MathWriting datasets) and sketch classification (on QuickDraw). This adaptability makes InkFM a powerful starting point for developing applications with handwritten input.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2503.23081 [cs.CV]
	(or arXiv:2503.23081v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.23081

Submission history

From: Anastasiia Fadeeva
[v1] Sat, 29 Mar 2025 13:45:24 UTC (2,964 KB)
