LexGPT 0.1: pre-trained GPT-J models with Pile of Law

Lee, Jieh-Sheng

arXiv:2306.05431 (cs)

[Submitted on 5 Jun 2023]

Abstract: This research aims to build generative language models specialized for the legal domain. The manuscript presents the development of LexGPT models, which are based on GPT-J models and pre-trained on Pile of Law. The foundation model built in this manuscript is an initial step toward future applications in the legal domain, such as further training with reinforcement learning from human feedback. Another objective is to help legal professionals use language models through a "No Code" approach: by fine-tuning models on specialized data, without modifying any source code, legal professionals can create custom language models for downstream tasks with minimal effort and technical knowledge. The downstream task in this manuscript is to turn a LexGPT model into a classifier, although its performance is notably lower than the state-of-the-art result. How to enhance downstream task performance without modifying the model or its source code is a topic for future research.

Comments:	10 pages and 2 figures. To be published in the Proceedings of the Seventeenth International Workshop on Juris-informatics (JURISIN 2023), hosted by JSAI International Symposia on AI 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2306.05431 [cs.CL]
	(or arXiv:2306.05431v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.05431

Submission history

From: Jieh-Sheng Lee
[v1] Mon, 5 Jun 2023 08:42:59 UTC (1,259 KB)
