Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles

He, Lewei; Shi, Tianyu; Huang, Pengran; Chen, Bingzhi; Chen, Qianglong; Pan, Jiahui

Computer Science > Artificial Intelligence

arXiv:2409.18014 (cs)

[Submitted on 26 Sep 2024]

Title:Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles

Authors:Lewei He, Tianyu Shi, Pengran Huang, Bingzhi Chen, Qianglong Chen, Jiahui Pan

View PDF HTML (experimental)

Abstract:Large language models (LLMs) with long-context processing are still challenging because of their implementation complexity, training efficiency and data sparsity. To address this issue, a new paradigm named Online Long-context Processing (OLP) is proposed when we process a document of unlimited length, which typically occurs in the information reception and organization of diverse streaming media such as automated news reporting, live e-commerce, and viral short videos. Moreover, a dilemma was often encountered when we tried to select the most suitable LLM from a large number of LLMs amidst explosive growth aiming for outstanding performance, affordable prices, and short response delays. In view of this, we also develop Role Reinforcement Learning (Role-RL) to automatically deploy different LLMs in their respective roles within the OLP pipeline according to their actual performance. Extensive experiments are conducted on our OLP-MINI dataset and it is found that OLP with Role-RL framework achieves OLP benchmark with an average recall rate of 93.2% and the LLM cost saved by 79.4%. The code and dataset are publicly available at: this https URL.

Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2409.18014 [cs.AI]
	(or arXiv:2409.18014v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2409.18014

Submission history

From: Lewei He [view email]
[v1] Thu, 26 Sep 2024 16:22:59 UTC (7,591 KB)

Computer Science > Artificial Intelligence

Title:Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Artificial Intelligence

Title:Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators