Combined Search and Encoding for Seeds, with an Application to Minimal Perfect Hashing

Lehmann, Hans-Peter; Sanders, Peter; Walzer, Stefan; Ziegler, Jonatan

arXiv:2502.05613 (cs)

[Submitted on 8 Feb 2025]

Abstract: Randomised algorithms often employ methods that can fail and that are retried with independent randomness until they succeed. Randomised data structures therefore often store indices of successful attempts, called seeds. If $n$ such seeds are required (e.g., for independent substructures) the standard approach is to compute for each $i \in [n]$ the smallest successful seed $S_i$ and store $\vec{S} = (S_1, \ldots, S_n)$.
The central observation of this paper is that this is not space-optimal. We present a different algorithm that computes a sequence $\vec{S}' = (S_1', \ldots, S_n')$ of successful seeds such that the entropy of $\vec{S}'$ undercuts the entropy of $\vec{S}$ by $\Omega(n)$ bits in most cases. To achieve a memory consumption of $\mathrm{OPT}+\varepsilon n$, the expected number of inspected seeds increases by a factor of $O(1/\varepsilon)$.
We demonstrate the usefulness of our findings with a novel construction for minimal perfect hash functions with space requirement $(1+\varepsilon)\mathrm{OPT}$. The construction time is $O(n/\varepsilon)$ while all previous approaches have construction times that increase exponentially with $1/\varepsilon$. Our implementation beats the construction throughput of the state of the art by up to two orders of magnitude.
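
The standard approach described in the abstract (store the smallest successful seed for each $i$) can be sketched as follows. This is an illustration only, not the paper's algorithm or implementation. It assumes a toy model, not stated in the abstract, in which every attempt succeeds independently with the same probability `p`; the function names and the hash-based success test are hypothetical. Under that model the smallest successful seed is geometrically distributed, and the sketch compares its entropy with $\log_2(1/p)$, which is used here only as an assumed reference point (the abstract does not define $\mathrm{OPT}$).

```python
import hashlib
import math
from collections import Counter


def succeeds(i: int, seed: int, p: float) -> bool:
    """Hypothetical success test: attempt `seed` for substructure `i`
    succeeds with probability about `p`, decided by hashing (i, seed)."""
    digest = hashlib.sha256(f"{i}:{seed}".encode()).digest()
    return int.from_bytes(digest[:8], "big") < p * 2**64


def smallest_successful_seed(i: int, p: float) -> int:
    """Standard approach: try seeds 0, 1, 2, ... and keep the first success."""
    seed = 0
    while not succeeds(i, seed, p):
        seed += 1
    return seed


def geometric_entropy_bits(p: float) -> float:
    """Entropy in bits of the number of failures before the first success."""
    q = 1.0 - p
    return (-q * math.log2(q) - p * math.log2(p)) / p


def empirical_entropy_bits(values) -> float:
    """Plug-in entropy estimate (bits per value) of an observed list."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    n, p = 20000, 0.25  # assumed toy parameters
    seeds = [smallest_successful_seed(i, p) for i in range(n)]
    print("empirical bits per seed:", empirical_entropy_bits(seeds))
    print("geometric entropy      :", geometric_entropy_bits(p))
    print("log2(1/p) reference    :", math.log2(1 / p))
```

In this toy model the geometric entropy exceeds $\log_2(1/p)$ by $\frac{1-p}{p}\log_2\frac{1}{1-p}$ bits per seed, which is positive for every $0 < p < 1$. That is consistent with the abstract's claim that storing smallest seeds leaves $\Omega(n)$ bits on the table, although the paper's own analysis may be set up differently.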

Subjects:	Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2502.05613 [cs.DS]
	(or arXiv:2502.05613v1 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.2502.05613

Submission history

From: Hans-Peter Lehmann
[v1] Sat, 8 Feb 2025 15:41:58 UTC (503 KB)
