A Review of the Marathi Natural Language Processing

Dani, Asang; Sathe, Shailesh R

arXiv:2412.15471 (cs)

[Submitted on 20 Dec 2024 (v1), last revised 24 Dec 2024 (this version, v2)]

Abstract: Marathi is one of the most widely used languages in the world. One might expect the latest advances in NLP research for languages like English to reach such a large community. However, NLP advancements in English did not immediately reach Indian languages like Marathi. There were several reasons for this, including the diversity of scripts used and the lack of publicly available resources such as tokenization strategies, high-quality datasets and benchmarks, and evaluation metrics. In addition, the morphologically rich nature of Marathi made NLP tasks challenging. Advances in neural network (NN) based models and tools since the early 2000s helped improve this situation and made NLP research more accessible. In the past 10 years, significant efforts were made to improve language resources for all 22 scheduled languages of India. This paper presents a broad overview of the evolution of NLP research in Indic languages, with a focus on Marathi, and of the state-of-the-art resources and tools available to the research community. It also provides an overview of tools and techniques associated with Marathi NLP tasks.

Subjects:	Computation and Language (cs.CL)
ACM classes:	I.2.7
Cite as:	arXiv:2412.15471 [cs.CL]
	(or arXiv:2412.15471v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.15471

Submission history

From: Asang Dani
[v1] Fri, 20 Dec 2024 00:56:13 UTC (318 KB)
[v2] Tue, 24 Dec 2024 13:33:51 UTC (318 KB)
