Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems

Varshney, Neeraj; Baral, Chitta

arXiv:2210.05528 (cs)

[Submitted on 11 Oct 2022]

Abstract: Do all instances need inference through the big models for a correct prediction? Perhaps not; some instances are easy and can be answered correctly by even small-capacity models. This provides opportunities for improving the computational efficiency of systems. In this work, we present an exploratory study on 'model cascading', a simple technique that utilizes a collection of models of varying capacities to output predictions accurately yet efficiently. Through comprehensive experiments in multiple task settings that differ in the number of models available for cascading (the K value), we show that cascading improves both computational efficiency and prediction accuracy. For instance, in the K=3 setting, cascading saves up to 88.93% of the computation cost and consistently achieves superior prediction accuracy, with an improvement of up to 2.18%. We also study the impact of introducing additional models in the cascade and show that it further increases the efficiency improvements. Finally, we hope that our work will facilitate the development of efficient NLP systems, making their widespread adoption in real-world applications possible.
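The cascading idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the cascade decides when a smaller model's answer is good enough, so the rule used here (accept a prediction once its confidence reaches a per-model threshold, otherwise pass the instance to the next larger model) is an assumption made for illustration, not the authors' confirmed method. All names (`cascade_predict`, `small`, `medium`, `large`), the cost units, and the threshold values are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Assumption: each model maps an input text to (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade_predict(
    x: str,
    models: Sequence[Model],
    costs: Sequence[float],
    thresholds: Sequence[float],
) -> Tuple[str, float, int]:
    """Query models from smallest to largest and stop at the first confident one.

    Returns (label, total cost spent, index of the model that answered).
    The last model always answers, so it needs no threshold.
    """
    if not models:
        raise ValueError("at least one model is required")
    if len(costs) != len(models) or len(thresholds) != len(models) - 1:
        raise ValueError("need one cost per model and one threshold per non-final model")

    total_cost = 0.0
    for i, (model, cost) in enumerate(zip(models, costs)):
        label, confidence = model(x)
        total_cost += cost  # every model that runs adds to the bill
        is_last = i == len(models) - 1
        if is_last or confidence >= thresholds[i]:
            return label, total_cost, i
    raise AssertionError("unreachable: the last model always answers")


# Toy stand-ins for a K=3 cascade; real systems would wrap trained classifiers.
def small(x: str) -> Tuple[str, float]:
    return ("pos", 0.95) if len(x) < 10 else ("neg", 0.55)


def medium(x: str) -> Tuple[str, float]:
    return ("pos", 0.80) if len(x) < 20 else ("neg", 0.60)


def large(x: str) -> Tuple[str, float]:
    return ("pos", 0.99)


MODELS = [small, medium, large]
COSTS = [1.0, 4.0, 16.0]   # hypothetical relative inference costs
THRESHOLDS = [0.9, 0.75]   # hypothetical confidence cut-offs

if __name__ == "__main__":
    for text in ["short", "medium length", "a considerably longer input"]:
        print(text, "->", cascade_predict(text, MODELS, COSTS, THRESHOLDS))
```

In this sketch an easy input is answered by the smallest model alone, so only its cost is paid, while a hard input falls through to the largest model and pays the sum of all three costs. The efficiency gain therefore depends on how many instances exit early, which in turn depends on the chosen thresholds.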

Comments:	EMNLP 2022
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2210.05528 [cs.CL]
	(or arXiv:2210.05528v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.05528

Submission history

From: Neeraj Varshney
[v1] Tue, 11 Oct 2022 15:17:52 UTC (4,171 KB)
