Hardness of Learning Halfspaces with Massart Noise

Diakonikolas, Ilias; Kane, Daniel M.

arXiv:2012.09720v2 (cs)

[Submitted on 17 Dec 2020 (v1), revised 23 Aug 2021 (this version, v2), latest version 8 Nov 2021 (v3)]

Abstract: We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise. Specifically, given labeled examples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{n} \times \{ \pm 1\}$ such that the marginal distribution on $x$ is arbitrary and the labels are generated by an unknown halfspace corrupted with Massart noise at rate $\eta<1/2$, we want to compute a hypothesis with small misclassification error. Characterizing the efficient learnability of halfspaces in the Massart model has remained a longstanding open problem in learning theory.
Recent work gave a polynomial-time learning algorithm for this problem with error $\eta+\epsilon$. This error upper bound can be far from the information-theoretically optimal bound of $\mathrm{OPT}+\epsilon$. More recent work showed that exact learning, i.e., achieving error $\mathrm{OPT}+\epsilon$, is hard in the Statistical Query (SQ) model. In this work, we show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by a polynomial-time SQ algorithm. In particular, our lower bound implies that no efficient SQ algorithm can approximate the optimal error within any polynomial factor.
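
As an illustration of why error $\eta+\epsilon$ can be far from $\mathrm{OPT}+\epsilon$, the following is a minimal sketch (not taken from the paper) that samples Massart-noisy labels and estimates $\mathrm{OPT}$ as the empirical error of the true halfspace. Under Massart noise each label is flipped independently with probability $\eta(x) \le \eta$, so the true halfspace has error $\mathbb{E}[\eta(x)]$, which may be much smaller than $\eta$. The Gaussian marginal, the weight vector `w`, the flip region in `eta_fn`, and the helper name `sample_massart` are all assumptions chosen for the example.

```python
import random

ETA = 0.4  # upper bound on the flip probability (eta < 1/2); illustrative value


def eta_fn(x):
    # Hypothetical noise function: flip at the full rate ETA only on a small
    # region of the space, and never elsewhere. Any eta(x) <= ETA is allowed.
    return ETA if x[0] > 1.5 else 0.0


def sample_massart(n_samples, w, noise_fn, rng):
    """Draw (x, y, flipped) triples with y = sign(<w, x>) flipped w.p. noise_fn(x)."""
    data = []
    for _ in range(n_samples):
        # The model allows an arbitrary marginal on x; a standard Gaussian
        # is used here purely as an assumption for the sketch.
        x = [rng.gauss(0.0, 1.0) for _ in w]
        clean = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
        flipped = rng.random() < noise_fn(x)
        data.append((x, -clean if flipped else clean, flipped))
    return data


rng = random.Random(0)
w = [1.0, -2.0, 0.5]  # hypothetical target halfspace
data = sample_massart(20000, w, eta_fn, rng)

# Empirical error of the true halfspace: the fraction of flipped labels.
opt_estimate = sum(flipped for _, _, flipped in data) / len(data)
print(f"eta = {ETA}, estimated OPT = {opt_estimate:.4f}")
```

In this setup the flip region has small probability mass, so the estimated $\mathrm{OPT}$ comes out far below $\eta$. A learner that only guarantees error $\eta+\epsilon$ would be far from optimal on such a distribution.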

Comments:	Slightly revised presentation
Subjects:	Machine Learning (cs.LG); Computational Complexity (cs.CC); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2012.09720 [cs.LG]
	(or arXiv:2012.09720v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2012.09720

Submission history

From: Ilias Diakonikolas
[v1] Thu, 17 Dec 2020 16:43:11 UTC (24 KB)
[v2] Mon, 23 Aug 2021 16:18:45 UTC (143 KB)
[v3] Mon, 8 Nov 2021 18:19:54 UTC (31 KB)
