AraLegal-BERT: A pretrained language model for Arabic Legal text

AL-Qurishi, Muhammad; AlQaseemi, Sarah; Soussi, Riad

arXiv:2210.08284 (cs)

[Submitted on 15 Oct 2022]

Abstract: The effectiveness of the BERT model on multiple linguistic tasks has been well documented. However, its potential for narrow, specific domains such as law has not been fully explored. In this paper, we examine how BERT can be used in the Arabic legal domain and customize this language model for several downstream tasks, using several different domain-relevant training and testing datasets to train BERT from scratch. We introduce AraLegal-BERT, a bidirectional encoder Transformer-based model that has been thoroughly tested and carefully optimized with the goal of amplifying the impact of NLP-driven solutions concerning jurisprudence, legal documents, and legal practice. We fine-tuned AraLegal-BERT and evaluated it against three BERT variants for Arabic on three natural language understanding (NLU) tasks. The results show that the base version of AraLegal-BERT achieves better accuracy than the general and original BERT on legal text.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2210.08284 [cs.CL]
	(or arXiv:2210.08284v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.08284

Submission history

From: Muhammad Al-Qurishi Dr
[v1] Sat, 15 Oct 2022 13:08:40 UTC (62 KB)
