OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning

Ye, Rui; Wang, Wenhao; Chai, Jingyi; Li, Dihan; Li, Zexi; Xu, Yinda; Du, Yaxin; Wang, Yanfeng; Chen, Siheng

arXiv:2402.06954 (cs)

[Submitted on 10 Feb 2024]

Abstract: Trained on massive publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields. While more data contributes to better performance, a disconcerting reality is that high-quality public data will be exhausted in a few years. In this paper, we offer a potential next step for contemporary LLMs: collaborative and privacy-preserving LLM training on underutilized distributed private data via federated learning (FL), where multiple data owners collaboratively train a shared model without transmitting raw data. To achieve this, we build a concise, integrated, and research-friendly framework/codebase, named OpenFedLLM. It covers federated instruction tuning for enhancing instruction-following capability, federated value alignment for aligning with human values, and 7 representative FL algorithms. In addition, OpenFedLLM supports training on diverse domains, covering 8 training datasets, and provides comprehensive evaluations, covering 30+ evaluation metrics. Through extensive experiments, we observe that all FL algorithms outperform local training when training LLMs, demonstrating a clear performance improvement across a variety of settings. Notably, in a financial benchmark, Llama2-7B fine-tuned by applying any FL algorithm can outperform GPT-4 by a significant margin while the model obtained through individual training cannot, demonstrating strong motivation for clients to participate in FL. The code is available at this https URL.
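
To make the federated setting described in the abstract concrete, here is a minimal sketch of one common FL scheme, FedAvg-style weighted averaging. The abstract does not name its seven algorithms, so the choice of FedAvg is an assumption made for illustration. Everything here is hypothetical and is not taken from the OpenFedLLM codebase: the function names (`local_train`, `fedavg_round`, `run_fedavg`), the scalar model y = w * x, and the hyperparameters. The point is only that clients send model parameters to the server, never raw data.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_train(w: float, data: Dataset, lr: float = 0.05, steps: int = 5) -> float:
    """Run a few gradient steps on one client's private data.

    Toy stand-in for local fine-tuning: fits y = w * x by
    minimizing mean squared error. Only the updated w leaves the client.
    """
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg_round(w_global: float, client_datasets: List[Dataset]) -> float:
    """One communication round: broadcast, local training, weighted average."""
    total = sum(len(d) for d in client_datasets)
    # Each client starts from the same global parameter and trains locally.
    updates = [(local_train(w_global, d), len(d)) for d in client_datasets]
    # The server averages parameters, weighting each client by its sample count.
    return sum(w * n for w, n in updates) / total


def run_fedavg(client_datasets: List[Dataset], rounds: int = 20, w0: float = 0.0) -> float:
    """Repeat communication rounds starting from the initial parameter w0."""
    w = w0
    for _ in range(rounds):
        w = fedavg_round(w, client_datasets)
    return w
```

Calling `run_fedavg` on a list of per-client datasets returns the shared parameter after the given number of rounds. In an LLM setting the scalar `w` would typically be replaced by the trainable weights (for example, adapter parameters), but the round structure stays the same under this assumption.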

Comments:	28 pages, 3 figures, 16 tables
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
Cite as:	arXiv:2402.06954 [cs.LG]
	(or arXiv:2402.06954v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.06954

Submission history

From: Rui Ye
[v1] Sat, 10 Feb 2024 13:50:11 UTC (481 KB)
