Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems

Klisura, Đorđe; Rios, Anthony

arXiv:2406.14545 (cs)

[Submitted on 20 Jun 2024 (v1), last revised 17 Oct 2024 (this version, v2)]

Abstract: Text-to-SQL systems empower users to interact with databases using natural language, automatically translating queries into executable SQL code. However, their reliance on database schema information for SQL generation exposes them to significant security vulnerabilities, particularly schema inference attacks that can lead to unauthorized data access or manipulation. In this paper, we introduce a novel zero-knowledge framework for reconstructing the underlying database schema of text-to-SQL models without any prior knowledge of the database. Our approach systematically probes text-to-SQL models with specially crafted questions and leverages a surrogate GPT-4 model to interpret the outputs, effectively uncovering hidden schema elements, including tables, columns, and data types. We demonstrate that our method achieves high accuracy in reconstructing table names, with F1 scores of up to 0.99 for generative models and 0.78 for fine-tuned models, underscoring the severity of schema leakage risks. Furthermore, we propose a simple protection mechanism for generative models and empirically show its limitations in mitigating these attacks.
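
To make the reported table-name F1 scores concrete, the following is a minimal sketch of how a set-based F1 between reconstructed and true table names could be computed. It is an illustration only: the function name `table_name_f1`, the case-insensitive matching, and the example table names are assumptions, not the authors' evaluation code, and the paper may match names differently.

```python
def table_name_f1(predicted, gold):
    """Return (precision, recall, F1) over two collections of table names.

    Assumption: names are compared as sets after trimming whitespace and
    lowercasing. The paper's exact matching rule is not stated in the abstract.
    """
    pred = {name.strip().lower() for name in predicted}
    true = {name.strip().lower() for name in gold}

    # With an empty prediction or an empty gold set, every score is zero.
    if not pred or not true:
        return 0.0, 0.0, 0.0

    true_positives = len(pred & true)
    precision = true_positives / len(pred)
    recall = true_positives / len(true)

    # Avoid dividing by zero when nothing was recovered correctly.
    if true_positives == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: three inferred table names against a four-table schema.
inferred = ["Users", "orders", "invoices"]
actual = ["users", "orders", "products", "payments"]
precision, recall, f1 = table_name_f1(inferred, actual)
```

In this hypothetical case two of the three inferred names are correct and two of the four true tables are recovered, so precision is 2/3, recall is 1/2, and F1 is 4/7.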

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2406.14545 [cs.CL]
	(or arXiv:2406.14545v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.14545

Submission history

From: Anthony Rios
[v1] Thu, 20 Jun 2024 17:54:33 UTC (294 KB)
[v2] Thu, 17 Oct 2024 15:06:23 UTC (402 KB)
