Faster Compression of Deterministic Finite Automata

Bille, Philip; Gørtz, Inge Li; Pedersen, Max Rishøj

arXiv:2306.12771v1 (cs)

[Submitted on 22 Jun 2023 (this version), latest version 4 Sep 2024 (v3)]

Abstract: Deterministic finite automata (DFAs) are a classic tool for high-throughput matching of regular expressions, both in theory and in practice.
Due to their high space consumption, extensive research has been devoted to compressed representations of DFAs that still support efficient pattern matching queries.
Kumar et al. [SIGCOMM 2006] introduced the delayed deterministic finite automaton (D²FA), which exploits the large redundancy between inter-state transitions in the automaton.
They showed that it obtains up to two orders of magnitude compression on real-world DFAs, and their work formed the basis of numerous subsequent results.
Their algorithm, as well as later algorithms based on their idea, has an inherent quadratic-time bottleneck, since these algorithms consider every pair of states to compute the optimal compression.
In this work we present a simple, general framework based on locality-sensitive hashing for speeding up these algorithms to achieve sub-quadratic construction times for D²FAs.
We apply the framework to speed up several algorithms to near-linear time, and experimentally evaluate their performance on real-world regular expression sets extracted from modern intrusion detection systems.
We find an order-of-magnitude improvement in compression times, with either little or no loss of compression, or even significantly better compression in some cases.
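
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a DFA stored as a dense transition table, uses bit sampling (a standard locality-sensitive hash for Hamming distance) to group states with similar transition rows, and then lets every state in a group point to the group's first state through a default transition, storing only the transitions that differ. The function names and the parameter k are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def lsh_buckets(delta, k, seed=0):
    """Group states whose transition rows agree on k randomly sampled symbols.

    delta[s][c] is the target of state s on symbol c (dense table, assumed).
    Rows that are close in Hamming distance are likely to share a bucket.
    """
    rng = random.Random(seed)
    alphabet_size = len(delta[0])
    positions = [rng.randrange(alphabet_size) for _ in range(k)]
    buckets = defaultdict(list)
    for state, row in enumerate(delta):
        key = tuple(row[p] for p in positions)
        buckets[key].append(state)
    return list(buckets.values())


def compress(delta, k=2, seed=0):
    """Build a delayed-DFA-style representation from bucketed states.

    Returns (default, labeled): default[s] is the default-transition target
    of s (or None), and labeled[s] maps symbol -> target for the transitions
    that s still stores explicitly.
    """
    default = [None] * len(delta)
    labeled = [dict(enumerate(row)) for row in delta]
    for bucket in lsh_buckets(delta, k, seed):
        root = bucket[0]  # the root keeps its full row, so lookups terminate
        for s in bucket[1:]:
            default[s] = root
            # Keep only the transitions on which s disagrees with the root.
            labeled[s] = {
                c: t for c, t in enumerate(delta[s]) if delta[root][c] != t
            }
    return default, labeled


def step(default, labeled, state, symbol):
    """Follow default transitions until a labeled transition on symbol exists."""
    while symbol not in labeled[state]:
        state = default[state]
    return labeled[state][symbol]


# A toy 4-state DFA over a 3-symbol alphabet with heavily redundant rows.
delta = [[1, 0, 0], [1, 2, 0], [1, 0, 0], [1, 2, 0]]
default, labeled = compress(delta)
stored = sum(len(m) for m in labeled)
```

Only states that land in the same bucket are compared, so the sketch avoids examining every pair of states. Because each default transition points directly to a state that stores its full row, a lookup follows at most one default transition here; the D²FA constructions discussed in the abstract allow longer default paths and choose default targets more carefully.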

Subjects:	Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2306.12771 [cs.DS]
	(or arXiv:2306.12771v1 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.2306.12771

Submission history

From: Philip Bille [view email]
[v1] Thu, 22 Jun 2023 09:51:40 UTC (2,151 KB)
[v2] Fri, 19 Jan 2024 14:28:40 UTC (3,135 KB)
[v3] Wed, 4 Sep 2024 10:40:12 UTC (3,142 KB)
