Partitioned Learned Bloom Filter

Vaidya, Kapil; Knorr, Eric; Kraska, Tim; Mitzenmacher, Michael

arXiv:2006.03176 (cs)

[Submitted on 5 Jun 2020 (v1), last revised 4 Oct 2020 (this version, v2)]

Abstract: Bloom filters are space-efficient probabilistic data structures that test whether an element is a member of a set and may return false positives. Recently, variations referred to as learned Bloom filters were developed that can reduce the false positive rate by using a learned model for the represented set. However, previous methods for learned Bloom filters do not take full advantage of the learned model. Here we show how to frame the problem of optimal model utilization as an optimization problem and, using our framework, derive algorithms that can achieve near-optimal performance in many cases. Experimental results from both simulated and real-world datasets show significant performance improvements from our optimization approach over both the original learned Bloom filter constructions and previously proposed heuristic improvements.
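
The abstract does not spell out the construction, so the following is only a rough Python sketch of the general idea: a plain Bloom filter, plus a toy structure that splits items into score regions and gives each region its own backup filter. Everything named here (`BloomFilter`, `PartitionedFilter`, `toy_score`, the threshold `0.5`, and the per-region false positive rates) is a hypothetical illustration, not the authors' implementation. In particular, the thresholds and rates are fixed by hand, whereas the paper's contribution is choosing them by optimization, which this sketch does not attempt.

```python
import bisect
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter: no false negatives, tunable false positive rate."""

    def __init__(self, capacity, fpr):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        capacity = max(1, capacity)
        self.m = max(1, math.ceil(-capacity * math.log(fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(item))


class PartitionedFilter:
    """Toy partitioned structure: one backup Bloom filter per score region."""

    def __init__(self, keys, score, thresholds, fprs):
        # thresholds: ascending cut points; len(fprs) must be len(thresholds) + 1.
        self.score = score
        self.thresholds = thresholds
        regions = [[] for _ in fprs]
        for key in keys:
            regions[self._region(key)].append(key)
        self.filters = []
        for members, fpr in zip(regions, fprs):
            if fpr >= 1.0:
                # A rate of 1 means "accept everything": no filter is stored.
                self.filters.append(None)
            else:
                bf = BloomFilter(len(members), fpr)
                for key in members:
                    bf.add(key)
                self.filters.append(bf)

    def _region(self, item):
        # Index of the score interval that the item falls into.
        return bisect.bisect_right(self.thresholds, self.score(item))

    def __contains__(self, item):
        bf = self.filters[self._region(item)]
        return True if bf is None else item in bf


def toy_score(item):
    # Hypothetical stand-in for a learned model: returns a score in [0, 1].
    return 0.9 if item.startswith("key-") and len(item) <= 6 else 0.2


keys = [f"key-{i}" for i in range(1000)]
pf = PartitionedFilter(keys, toy_score, thresholds=[0.5], fprs=[0.01, 1.0])
print(all(k in pf for k in keys))  # every inserted key is reported present
```

In this toy setup, high-scoring items are accepted outright and low-scoring items are checked against a backup filter, so inserted keys are never rejected, while non-keys may still be accepted as false positives.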

Comments:	13 pages, 3 figures
Subjects:	Data Structures and Algorithms (cs.DS); Databases (cs.DB); Machine Learning (cs.LG)
Cite as:	arXiv:2006.03176 [cs.DS]
	(or arXiv:2006.03176v2 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.2006.03176

Submission history

From: Kapil Vaidya
[v1] Fri, 5 Jun 2020 00:05:32 UTC (610 KB)
[v2] Sun, 4 Oct 2020 15:15:17 UTC (903 KB)
