Asynchronous Parallel Sampling Gradient Boosting Decision Tree

Daning, Cheng; Fen, Xia; Shigang, Li; Yunquan, Zhang


arXiv:1804.04659v1 (cs)

[Submitted on 12 Apr 2018 (this version), latest version 18 Jul 2019 (v4)]


Authors: Cheng Daning, Xia Fen, Li Shigang, Zhang Yunquan


Abstract: With the development of big data technology, the Gradient Boosting Decision Tree (GBDT) has become one of the most important machine learning algorithms because of its accurate output. However, training a GBDT requires substantial computational resources and time. To accelerate GBDT training, this paper proposes the asynchronous parallel sampling gradient boosting decision tree (asynch-SGBDT). By introducing sampling, we recast the numerical optimization process of traditional GBDT training as a stochastic optimization process and use asynchronous parallel stochastic gradient descent to accelerate it. We also provide a theoretical analysis of asynch-SGBDT. Experimental results show that asynch-SGBDT accelerates GBDT training, and that our asynchronous parallel strategy achieves an almost linear speedup, especially on high-dimensional sparse datasets.
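The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes squared loss, one-split regression stumps as the trees, Bernoulli sampling of training points, and a lock-protected shared list standing in for a parameter server. The names `fit_stump`, `predict`, `mse`, `worker`, and `train_async` are hypothetical. Each worker reads a possibly stale copy of the ensemble, computes residuals on its own sample, fits a stump, and appends it without waiting for the other workers.

```python
import random
import threading


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to (x, residual) pairs by least squares."""
    pairs = sorted(zip(xs, residuals))
    n = len(pairs)
    total = sum(r for _, r in pairs)
    # Fallback when no split is possible: a constant predictor.
    best = (float("inf"), pairs[0][0], total / n, total / n)
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # Minimizing squared error is equivalent to minimizing this score.
        score = -(i * left_mean ** 2 + (n - i) * right_mean ** 2)
        if score < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (score, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def predict(trees, x, lr):
    """Ensemble prediction: learning-rate-weighted sum of stump outputs."""
    return sum(lr * (left if x <= t else right) for t, left, right in trees)


def mse(xs, ys, trees, lr):
    return sum((y - predict(trees, x, lr)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def worker(xs, ys, trees, lock, rounds, sample_rate, lr, seed):
    rng = random.Random(seed)
    for _ in range(rounds):
        with lock:
            snapshot = list(trees)  # may be stale by the time we append
        # Sampling step: each point is kept with probability sample_rate.
        idx = [i for i in range(len(xs)) if rng.random() < sample_rate]
        if len(idx) < 2:
            continue
        # For squared loss, the negative gradient is the residual.
        res = [ys[i] - predict(snapshot, xs[i], lr) for i in idx]
        stump = fit_stump([xs[i] for i in idx], res)
        with lock:
            trees.append(stump)  # asynchronous update, no barrier between workers


def train_async(xs, ys, workers=4, rounds=25, sample_rate=0.5, lr=0.1):
    trees, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(xs, ys, trees, lock, rounds, sample_rate, lr, seed))
        for seed in range(workers)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return trees


if __name__ == "__main__":
    xs = [i / 100 for i in range(200)]
    ys = [x * x for x in xs]
    model = train_async(xs, ys)
    print("MSE before:", mse(xs, ys, [], 0.1), "after:", mse(xs, ys, model, 0.1))
```

In CPython the global interpreter lock prevents these threads from running CPU-bound work in parallel, so the sketch shows only the asynchronous update pattern, not the speedup reported in the abstract.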

Subjects:	Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (stat.ML)
Cite as:	arXiv:1804.04659 [cs.LG]
	(or arXiv:1804.04659v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1804.04659

Submission history

From: Daning Cheng
[v1] Thu, 12 Apr 2018 14:06:05 UTC (862 KB)
[v2] Fri, 18 May 2018 04:26:26 UTC (814 KB)
[v3] Fri, 17 Aug 2018 01:57:44 UTC (813 KB)
[v4] Thu, 18 Jul 2019 06:50:05 UTC (873 KB)
