Computer Science > Computational Complexity
[Submitted on 29 Mar 2024]
Title:A New Information Complexity Measure for Multi-pass Streaming with Applications
View PDFAbstract:We introduce a new notion of information complexity for multi-pass streaming problems and use it to resolve several important questions in data streams.
In the coin problem, one sees a stream of $n$ i.i.d. uniform bits and one would like to compute the majority with constant advantage. We show that any constant pass algorithm must use $\Omega(\log n)$ bits of memory, significantly extending an earlier $\Omega(\log n)$ bit lower bound for single-pass algorithms of Braverman-Garg-Woodruff (FOCS, 2020). This also gives the first $\Omega(\log n)$ bit lower bound for the problem of approximating a counter up to a constant factor in worst-case turnstile streams for more than one pass.
In the needle problem, one either sees a stream of $n$ i.i.d. uniform samples from a domain $[t]$, or there is a randomly chosen needle $\alpha \in[t]$ for which each item independently is chosen to equal $\alpha$ with probability $p$, and is otherwise uniformly random in $[t]$. The problem of distinguishing these two cases is central to understanding the space complexity of the frequency moment estimation problem in random order streams. We show tight multi-pass space bounds for this problem for every $p < 1/\sqrt{n \log^3 n}$, resolving an open question of Lovett and Zhang (FOCS, 2023); even for $1$-pass our bounds are new. To show optimality, we improve both lower and upper bounds from existing results.
Our information complexity framework significantly extends the toolkit for proving multi-pass streaming lower bounds, and we give a wide number of additional streaming applications of our lower bound techniques, including multi-pass lower bounds for $\ell_p$-norm estimation, $\ell_p$-point query and heavy hitters, and compressed sensing problems.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.