Fast Learning from Sparse Data

Chickering, David Maxwell; Heckerman, David

arXiv:1301.6685 (cs)

[Submitted on 23 Jan 2013 (v1), last revised 16 May 2015 (this version, v2)]

Abstract: We describe two techniques that significantly improve the running time of several standard machine-learning algorithms when data is sparse. The first technique is an algorithm that efficiently extracts one-way and two-way counts, either real or expected, from discrete data. Extracting such counts is a fundamental step in learning algorithms for constructing a variety of models, including decision trees, decision graphs, Bayesian networks, and naive-Bayes clustering models. The second technique is an algorithm that efficiently performs the E-step of the EM algorithm (i.e., inference) when applied to a naive-Bayes clustering model. Using real-world data sets, we demonstrate a dramatic decrease in running time for algorithms that incorporate these techniques.
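
To make the idea of count extraction from sparse data concrete, here is a minimal sketch of one way sparsity can be exploited. It is an illustration under stated assumptions, not necessarily the algorithm the paper describes: every variable is assumed binary with default state 0, each case is stored as the set of variables that are in the non-default state 1, and the names `sparse_counts` and `two_way_table` are hypothetical. Only non-default occurrences are tallied; counts that involve the default state are derived afterwards by subtraction.

```python
from collections import Counter
from itertools import combinations


def sparse_counts(cases):
    """Tally non-default occurrences only.

    cases: iterable of sets; each set holds the indices of the variables
    that are in the non-default state (1) in that case. (Assumed layout.)
    Returns (number of cases, one-way counts, two-way counts).
    """
    cases = list(cases)
    one_way = Counter()  # var -> number of cases with var = 1
    two_way = Counter()  # (var_i, var_j), i < j -> cases with both = 1
    for present in cases:
        ordered = sorted(present)
        one_way.update(ordered)
        two_way.update(combinations(ordered, 2))
    return len(cases), one_way, two_way


def two_way_table(n, one_way, two_way, i, j):
    """Derive the full 2x2 table for variables i and j.

    Keys are (value of i, value of j). Cells that involve the default
    state are obtained by inclusion-exclusion, not by scanning the data.
    """
    n11 = two_way[(min(i, j), max(i, j))]
    n10 = one_way[i] - n11
    n01 = one_way[j] - n11
    n00 = n - one_way[i] - one_way[j] + n11
    return {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}


if __name__ == "__main__":
    # Toy data: five cases over variables 0..5, mostly in the default state.
    data = [{0, 2}, {2}, set(), {0}, {0, 2, 5}]
    n, one_way, two_way = sparse_counts(data)
    print(two_way_table(n, one_way, two_way, 0, 2))
```

In this sketch the work per case grows with the number of non-default values in that case (quadratically for the pairwise tallies), not with the total number of variables, which is where the saving on sparse data would come from.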

Comments:	Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI1999)
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Report number:	UAI-P-1999-PG-109-115
Cite as:	arXiv:1301.6685 [cs.LG]
	(or arXiv:1301.6685v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1301.6685

Submission history

From: Max Chickering (via Martijn de Jongh as proxy)
[v1] Wed, 23 Jan 2013 15:57:18 UTC (247 KB)
[v2] Sat, 16 May 2015 23:09:53 UTC (118 KB)
