Computational Complexity
Showing new listings for Friday, 27 September 2024
- [1] arXiv:2409.17831 [pdf, other]
Title: Asymptotically Optimal Hardness for $k$-Set Packing and $k$-Matroid Intersection
Comments: 14 pages
Subjects: Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
For any $\varepsilon > 0$, we prove that $k$-Dimensional Matching is hard to approximate within a factor of $k/(12 + \varepsilon)$ for large $k$ unless $\textsf{NP} \subseteq \textsf{BPP}$. Listed in Karp's 21 $\textsf{NP}$-complete problems, $k$-Dimensional Matching is a benchmark computational complexity problem which we find as a special case of many constrained optimization problems over independence systems including: $k$-Set Packing, $k$-Matroid Intersection, and Matroid $k$-Parity. For all the aforementioned problems, the best known lower bound was an $\Omega(k /\log(k))$-hardness by Hazan, Safra, and Schwartz. In contrast, state-of-the-art algorithms achieved an approximation of $O(k)$. Our result narrows down this gap to a constant and thus provides a rationale for the observed algorithmic difficulties. The crux of our result hinges on a novel approximation preserving gadget from $R$-degree bounded $k$-CSPs over alphabet size $R$ to $kR$-Dimensional Matching. Along the way, we prove that $R$-degree bounded $k$-CSPs over alphabet size $R$ are hard to approximate within a factor $\Omega_k(R)$ using known randomised sparsification methods for CSPs.
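To make the $O(k)$ upper bound concrete, the simplest algorithm with a guarantee of this kind is greedy. The sketch below is illustrative only and is not taken from the paper; `greedy_set_packing` is a hypothetical helper name. It scans the sets in input order and keeps each one that is disjoint from everything chosen so far. Every chosen set has at most $k$ elements, so it can block at most $k$ sets of an optimal packing, which yields a factor-$k$ guarantee.

```python
from typing import FrozenSet, Hashable, Iterable, List, Set


def greedy_set_packing(sets: Iterable[Iterable[Hashable]]) -> List[FrozenSet[Hashable]]:
    """Return a maximal family of pairwise disjoint sets, scanning in input order."""
    used: Set[Hashable] = set()          # elements already covered by chosen sets
    chosen: List[FrozenSet[Hashable]] = []
    for s in sets:
        fs = frozenset(s)
        if fs and used.isdisjoint(fs):   # keep the set only if it overlaps nothing chosen
            chosen.append(fs)
            used |= fs
    return chosen


# Toy 3-Set Packing instance: the optimum takes the last three sets,
# but greedy takes the first set, which intersects all of them.
instance = [{1, 4, 7}, {1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
packing = greedy_set_packing(instance)
```

On this toy instance greedy returns one set while the optimum has three, so the factor $k = 3$ is attained exactly.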
New submissions (showing 1 of 1 entries)
- [2] arXiv:2409.17250 (cross-list from cs.DS) [pdf, other]
Title: Kernelization Complexity of Solution Discovery Problems
Authors: Mario Grobler, Stephanie Maaz, Amer E. Mouawad, Naomi Nishimura, Vijayaragunathan Ramamoorthi, Sebastian Siebertz
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Combinatorics (math.CO)
In the solution discovery variant of a vertex (edge) subset problem $\Pi$ on graphs, we are given an initial configuration of tokens on the vertices (edges) of an input graph $G$ together with a budget $b$. The question is whether we can transform this configuration into a feasible solution of $\Pi$ on $G$ with at most $b$ modification steps. We consider the token sliding variant of the solution discovery framework, where each modification step consists of sliding a token to an adjacent vertex (edge). The framework of solution discovery was recently introduced by Fellows et al. [Fellows et al., ECAI 2023] and for many solution discovery problems the classical as well as the parameterized complexity has been established. In this work, we study the kernelization complexity of the solution discovery variants of Vertex Cover, Independent Set, Dominating Set, Shortest Path, Matching, and Vertex Cut with respect to the parameters number of tokens $k$, discovery budget $b$, as well as structural parameters such as pathwidth.
- [3] arXiv:2409.17567 (cross-list from cs.LG) [pdf, other]
Title: Derandomizing Multi-Distribution Learning
Subjects: Machine Learning (cs.LG); Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Statistics Theory (math.ST)
Multi-distribution or collaborative learning involves learning a single predictor that works well across multiple data distributions, using samples from each during training. Recent research on multi-distribution learning, focusing on binary loss and finite VC dimension classes, has shown near-optimal sample complexity that is achieved with oracle efficient algorithms. That is, these algorithms are computationally efficient given an efficient ERM for the class. Unlike in classical PAC learning, where the optimal sample complexity is achieved with deterministic predictors, current multi-distribution learning algorithms output randomized predictors. This raises the question: can these algorithms be derandomized to produce a deterministic predictor for multiple distributions? Through a reduction to discrepancy minimization, we show that derandomizing multi-distribution learning is computationally hard, even when ERM is computationally efficient. On the positive side, we identify a structural condition enabling an efficient black-box reduction, converting existing randomized multi-distribution predictors into deterministic ones.
Cross submissions (showing 2 of 2 entries)
- [4] arXiv:2009.10677 (replaced) [pdf, html, other]
Title: On the Mysteries of MAX NAE-SAT
Comments: 44 pages, 8 figures, accepted to SIDMA
Subjects: Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS)
MAX NAE-SAT is a natural optimization problem, closely related to its better-known relative MAX SAT. The approximability status of MAX NAE-SAT is almost completely understood if all clauses have the same size $k$, for some $k\ge 2$. We refer to this problem as MAX NAE-$\{k\}$-SAT. For $k=2$, it is essentially the celebrated MAX CUT problem. For $k=3$, it is related to the MAX CUT problem in graphs that can be fractionally covered by triangles. For $k\ge 4$, it is known that an approximation ratio of $1-\frac{1}{2^{k-1}}$, obtained by choosing a random assignment, is optimal, assuming $P\ne NP$. For every $k\ge 2$, an approximation ratio of at least $\frac{7}{8}$ can be obtained for MAX NAE-$\{k\}$-SAT. There was some hope, therefore, that there is also a $\frac{7}{8}$-approximation algorithm for MAX NAE-SAT, where clauses of all sizes are allowed simultaneously.
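The ratio $1-\frac{1}{2^{k-1}}$ for a random assignment can be checked by direct enumeration: a not-all-equal clause on $k$ distinct variables fails only when all $k$ literals take the same value, which is 2 of the $2^k$ assignments. The following is a minimal sketch, assuming a single clause with distinct variables; the function names are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def nae_satisfied(values) -> bool:
    """A NAE clause is satisfied iff its literal values are not all equal."""
    return len(set(values)) > 1


def random_assignment_ratio(k: int) -> Fraction:
    """Exact probability that a uniformly random assignment satisfies one
    NAE clause on k distinct variables, computed by brute-force enumeration."""
    good = sum(nae_satisfied(v) for v in product((False, True), repeat=k))
    return Fraction(good, 2 ** k)


# The enumeration agrees with the closed form 1 - 1/2^(k-1) for small k.
for k in range(2, 7):
    assert random_assignment_ratio(k) == 1 - Fraction(1, 2 ** (k - 1))
```

For $k=4$ the value is exactly $\frac{7}{8}$. For $k=2$ and $k=3$ a random assignment gives only $\frac{1}{2}$ and $\frac{3}{4}$, so the $\frac{7}{8}$ guarantee stated for those sizes has to come from other algorithms.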
Our main result is that there is no $\frac{7}{8}$-approximation algorithm for MAX NAE-SAT, assuming the unique games conjecture (UGC). In fact, even for almost satisfiable instances of MAX NAE-$\{3,5\}$-SAT (i.e., MAX NAE-SAT where all clauses have size $3$ or $5$), the best approximation ratio that can be achieved, assuming UGC, is at most $\frac{3(\sqrt{21}-4)}{2}\approx 0.8739$. Using calculus of variations, we extend the analysis of O'Donnell and Wu for MAX CUT to MAX NAE-$\{3\}$-SAT. We obtain an optimal algorithm, assuming UGC, for MAX NAE-$\{3\}$-SAT, slightly improving on previous algorithms. The approximation ratio of the new algorithm is $\approx 0.9089$.
We complement our theoretical results with some experimental results. We describe an approximation algorithm for almost satisfiable instances of MAX NAE-$\{3,5\}$-SAT with a conjectured approximation ratio of 0.8728, and an approximation algorithm for almost satisfiable instances of MAX NAE-SAT with a conjectured approximation ratio of 0.8698.
- [5] arXiv:2409.13096 (replaced) [pdf, html, other]
Title: Fast decision tree learning solves hard coding-theoretic problems
Comments: 31 pages, FOCS 2024
Subjects: Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG)
We connect the problem of properly PAC learning decision trees to the parameterized Nearest Codeword Problem ($k$-NCP). Despite significant effort by the respective communities, algorithmic progress on both problems has been stuck: the fastest known algorithm for the former runs in quasipolynomial time (Ehrenfeucht and Haussler 1989) and the best known approximation ratio for the latter is $O(n/\log n)$ (Berman and Karpinski 2002; Alon, Panigrahy, and Yekhanin 2009). Research on both problems has thus far proceeded independently with no known connections.
We show that $\textit{any}$ improvement of Ehrenfeucht and Haussler's algorithm will yield $O(\log n)$-approximation algorithms for $k$-NCP, an exponential improvement of the current state of the art. This can be interpreted either as a new avenue for designing algorithms for $k$-NCP, or as one for establishing the optimality of Ehrenfeucht and Haussler's algorithm. Furthermore, our reduction along with existing inapproximability results for $k$-NCP already rule out polynomial-time algorithms for properly learning decision trees. A notable aspect of our hardness results is that they hold even in the setting of $\textit{weak}$ learning whereas prior ones were limited to the setting of strong learning.
- [6] arXiv:2310.19594 (replaced) [pdf, html, other]
Title: Superpolynomial smoothed complexity of 3-FLIP in Local Max-Cut
Comments: 19 pages, 3 figures, replaced section 3.1 by a known result
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC)
Local search algorithms for NP-hard problems such as Max-Cut frequently perform much better in practice than worst-case analysis suggests. Smoothed analysis has proved an effective approach to understanding this: a substantial literature shows that when a small amount of random noise is added to input data, local search algorithms typically run in polynomial or quasi-polynomial time. In this paper, we provide the first example where a local search algorithm for the Max-Cut problem fails to be efficient in the framework of smoothed analysis. Specifically, we construct a graph with $n$ vertices where the smoothed runtime of the 3-FLIP algorithm can be as large as $2^{\Omega(\sqrt{n})}$.
Additionally, for the setting without random noise, we give a new construction of graphs where the runtime of the FLIP algorithm is $2^{\Omega(n)}$ for any pivot rule. These graphs are much smaller and have a simpler structure than previous constructions.
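For readers unfamiliar with FLIP, the following is a minimal sketch of single-vertex local search for Max-Cut, not the paper's construction. It repeatedly moves one vertex across the cut whenever doing so strictly increases the cut weight. The function names and the lowest-index pivot rule are assumptions made for illustration. The sketch covers only single-vertex moves; it does not implement 3-FLIP or the smoothed (noisy-weight) setting.

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight)


def cut_weight(edges: List[Edge], side: List[int]) -> float:
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])


def flip_local_search(n: int, edges: List[Edge], side: List[int]) -> Tuple[List[int], int]:
    """Run FLIP from the partition `side` (entries 0 or 1) until no
    single-vertex move improves the cut. Returns (partition, number of flips)."""
    adj: List[List[Tuple[int, float]]] = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def gain(v: int) -> float:
        # Flipping v cuts its same-side edges and uncuts its cross edges.
        return sum(w if side[u] == side[v] else -w for u, w in adj[v])

    side = list(side)  # work on a copy
    steps = 0
    while True:
        improving = [v for v in range(n) if gain(v) > 0]
        if not improving:
            return side, steps
        v = improving[0]  # assumed pivot rule: lowest-index improving vertex
        side[v] ^= 1
        steps += 1


# Unit-weight 4-cycle, starting with every vertex on side 0.
cycle = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
final_side, num_flips = flip_local_search(4, cycle, [0, 0, 0, 0])
```

Each flip strictly increases the cut weight and there are finitely many partitions, so the loop always terminates; the results above concern how many flips that can take.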