An information criterion for automatic gradient tree boosting

Lunde, Berent Ånund Strømnes; Kleppe, Tore Selland; Skaug, Hans Julius

arXiv:2008.05926 (stat)

[Submitted on 13 Aug 2020]

Abstract: An information-theoretic approach to learning the complexity of classification and regression trees, and the number of trees in gradient tree boosting, is proposed. The optimism (test loss minus training loss) of the greedy leaf-splitting procedure is shown to be the maximum of a Cox-Ingersoll-Ross process, from which a generalization-error-based information criterion is formed. The proposed procedure allows fast local model selection without cross-validation-based hyperparameter tuning, and hence makes the comparisons among the large number of models in each boosting iteration efficient and automatic. Relative to xgboost, speedups in numerical experiments range from around 10-fold to about 1400-fold, at similar predictive power measured in terms of test loss.
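The notion of optimism used in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' criterion and does not involve the Cox-Ingersoll-Ross result; it only estimates, by simulation, the gap between test and training loss for a single regression split. It assumes pure-noise Gaussian responses, a single uniform feature, and squared-error loss, and the helper names `fit_stump`, `mse`, and `mean_optimism` are hypothetical.

```python
import numpy as np

def fit_stump(x, y, greedy=True):
    """Return (threshold, left_mean, right_mean) for a one-split regression tree."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(ys)
    if greedy:
        # Cumulative sums let us score every candidate split in O(n).
        csum = np.cumsum(ys)
        k = np.arange(1, n)               # number of points in the left leaf
        left_sum = csum[:-1]
        right_sum = csum[-1] - left_sum
        # Maximising this quantity is equivalent to minimising training squared error.
        gain = left_sum**2 / k + right_sum**2 / (n - k)
        best = int(np.argmax(gain)) + 1
    else:
        best = n // 2                     # fixed split at the median, no search
    threshold = 0.5 * (xs[best - 1] + xs[best])
    return threshold, ys[:best].mean(), ys[best:].mean()

def mse(model, x, y):
    """Mean squared error of a fitted stump on the data (x, y)."""
    threshold, left_mean, right_mean = model
    pred = np.where(x < threshold, left_mean, right_mean)
    return float(np.mean((y - pred) ** 2))

def mean_optimism(greedy, n=100, reps=500, seed=0):
    """Average (test loss - training loss) over repeated pure-noise data sets."""
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(reps):
        # Assumption: y is independent of x, so any fitted split is pure overfitting.
        x_tr, y_tr = rng.uniform(size=n), rng.normal(size=n)
        x_te, y_te = rng.uniform(size=10 * n), rng.normal(size=10 * n)
        model = fit_stump(x_tr, y_tr, greedy)
        gaps.append(mse(model, x_te, y_te) - mse(model, x_tr, y_tr))
    return float(np.mean(gaps))

greedy_gap = mean_optimism(greedy=True)
fixed_gap = mean_optimism(greedy=False)
print(f"mean optimism, greedy split search: {greedy_gap:.3f}")
print(f"mean optimism, fixed median split:  {fixed_gap:.3f}")
```

Under these assumptions the greedy search should show the larger gap, because it picks the split that best fits the training noise. That search-induced inflation is the quantity the abstract says the proposed criterion accounts for analytically rather than by cross validation.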

Subjects:	Methodology (stat.ME); Computation (stat.CO); Machine Learning (stat.ML)
Cite as:	arXiv:2008.05926 [stat.ME]
	(or arXiv:2008.05926v1 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.2008.05926

Submission history

From: Berent Ånund Strømnes Lunde
[v1] Thu, 13 Aug 2020 14:24:27 UTC (394 KB)
