From Words to Code: Harnessing Data for Program Synthesis from Natural Language

Khatry, Anirudh; Cahoon, Joyce; Henkel, Jordan; Deep, Shaleen; Emani, Venkatesh; Floratou, Avrilia; Gulwani, Sumit; Le, Vu; Raza, Mohammad; Shi, Sherry; Singh, Mukul; Tiwari, Ashish

arXiv:2305.01598 (cs)

[Submitted on 2 May 2023 (v1), last revised 3 May 2023 (this version, v2)]

Abstract: Creating programs to correctly manipulate data is a difficult task, as the underlying programming languages and APIs can be challenging to learn for many users who are not skilled programmers. Large language models (LLMs) demonstrate remarkable potential for generating code from natural language, but in the data manipulation domain, apart from the natural language (NL) description of the intended task, we also have the dataset on which the task is to be performed, or the "data context". Existing approaches have utilized data context in a limited way by simply adding relevant information from the input data into the prompts sent to the LLM.
In this work, we utilize the available input data to execute the candidate programs generated by the LLMs and gather their outputs. We introduce semantic reranking, a technique to rerank the programs generated by LLMs based on three signals coming from the program outputs: (a) semantic filtering and well-formedness based score tuning: do programs even generate well-formed outputs, (b) semantic interleaving: how do the outputs from different candidates compare to each other, and (c) output-based score tuning: how do the outputs compare to outputs predicted for the same task. We provide theoretical justification for semantic interleaving. We also introduce temperature mixing, where we combine samples generated by LLMs using both high and low temperatures. We extensively evaluate our approach in three domains, namely databases (SQL), data science (Pandas) and business intelligence (Excel's Power Query M) on a variety of new and existing benchmarks. We observe substantial gains across domains, with improvements of up to 45% in top-1 accuracy and 34% in top-3 accuracy.
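
To make signals (a) and (b) concrete, here is a minimal Python sketch of one possible reading of the abstract. It is not the paper's algorithm: the function name `semantic_rerank`, the candidate format `(name, score, fn)`, the well-formedness test (no exception, non-empty output), and the round-robin interleaving over output groups are all assumptions made for illustration.

```python
from collections import OrderedDict


def semantic_rerank(candidates, data):
    """Rerank candidate programs by executing them on the input data.

    candidates: list of (name, score, fn) tuples, where fn(data) returns
    the program output. This format is hypothetical, not from the paper.
    """
    # Assumed filtering step: run candidates in order of model score and
    # drop those that crash or return an empty / missing output.
    executed = []
    for name, _score, fn in sorted(candidates, key=lambda c: -c[1]):
        try:
            out = fn(data)
        except Exception:
            continue
        if out is None or (hasattr(out, "__len__") and len(out) == 0):
            continue
        executed.append((name, repr(out)))

    # Group candidates whose outputs are identical. Groups keep the order
    # of their best-scoring member because of the sort above.
    groups = OrderedDict()
    for name, key in executed:
        groups.setdefault(key, []).append(name)

    # Assumed interleaving step: take one candidate per output group in
    # turn, so the top-k list covers distinct behaviours first.
    buckets = list(groups.values())
    ranked = []
    depth = 0
    while any(depth < len(b) for b in buckets):
        for b in buckets:
            if depth < len(b):
                ranked.append(b[depth])
        depth += 1
    return ranked


example_candidates = [
    ("sort_asc", 0.9, lambda d: sorted(d)),
    ("sort_asc_copy", 0.8, lambda d: sorted(list(d))),
    ("crash", 0.7, lambda d: d["x"]),       # raises TypeError on a list
    ("sort_desc", 0.6, lambda d: sorted(d, reverse=True)),
    ("empty", 0.5, lambda d: []),           # output is not well-formed
]
example_ranking = semantic_rerank(example_candidates, [3, 1, 2])
```

In this toy run the crashing and empty candidates are dropped, and `sort_desc` moves ahead of `sort_asc_copy` because it produces a different output from the top-scoring candidate. Signal (c) and temperature mixing are not covered by this sketch.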

Comments:	14 pages
Subjects:	Databases (cs.DB); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2305.01598 [cs.DB]
	(or arXiv:2305.01598v2 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2305.01598

Submission history

From: Anirudh Khatry
[v1] Tue, 2 May 2023 16:56:32 UTC (1,585 KB)
[v2] Wed, 3 May 2023 07:02:57 UTC (1,560 KB)
