ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Zhong, Lucen; Du, Zhengxiao; Zhang, Xiaohan; Hu, Haiyi; Tang, Jie


arXiv:2501.10132 (cs)

[Submitted on 17 Jan 2025]



Abstract: Enhancing large language models (LLMs) with real-time APIs can help generate more accurate and up-to-date responses. However, evaluating the function calling abilities of LLMs in real-world scenarios remains under-explored due to the complexity of data collection and evaluation. In this work, we introduce ComplexFuncBench, a benchmark for complex function calling across five real-world scenarios. Compared to existing benchmarks, ComplexFuncBench encompasses multi-step and constrained function calling, which requires long-parameter filling, parameter value reasoning, and 128k long context. Additionally, we propose an automatic framework, ComplexEval, for quantitatively evaluating complex function calling tasks. Through comprehensive experiments, we demonstrate the deficiencies of state-of-the-art LLMs in function calling and suggest future directions for optimizing these capabilities. The data and code are available at this https URL.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2501.10132 [cs.CL]
	(or arXiv:2501.10132v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.10132

Submission history

From: Lucen Zhong
[v1] Fri, 17 Jan 2025 11:41:53 UTC (1,517 KB)
