Optimizing Ensemble Weights and Hyperparameters of Machine Learning Models for Regression Problems

Shahhosseini, Mohsen; Hu, Guiping; Pham, Hieu

arXiv:1908.05287v4 (stat)

[Submitted on 14 Aug 2019 (v1), revised 11 Sep 2019 (this version, v4), latest version 31 Oct 2020 (v6)]

Abstract: Aggregating multiple learners into an ensemble aims to improve predictions by capturing the underlying distribution more accurately. Ensembling methods such as bagging, boosting, and stacking/blending have been studied and adopted extensively in research and practice. While bagging and boosting aim to reduce variance and bias, respectively, blending approaches target both by seeking the combination of base learners that gives the best trade-off between bias and variance. In blending, ensembles are created from weighted averages of multiple base learners. In this study, a systematic approach, the Cross-validated Optimal Weighted Ensemble (COWE), is proposed that uses cross-validation to find the weights that optimize this bias-variance trade-off for regression problems. Furthermore, it is known that tuning the hyperparameters of each base learner inside the ensemble weight optimization process can produce better-performing ensembles. To this end, a nested algorithm based on bi-level optimization is proposed that tunes hyperparameters while finding the optimal weights for combining the base learners: the Cross-validated Optimal Weighted Ensemble with Internally Tuned Hyperparameters (COWE-ITH). The algorithm is shown to generalize to real data through analyses of ten publicly available data sets. The prediction accuracies of COWE-ITH and COWE are compared with those of the base learners and state-of-the-art ensemble methods. The results show that COWE-ITH outperforms the other benchmarks as well as the base learners in 9 out of 10 data sets.
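
The weight-finding step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the objective, the constraints, or the solver, so everything below is an assumption: the weights are chosen to minimize mean squared error on out-of-fold (cross-validated) predictions, they are constrained to be non-negative and sum to one, and projected gradient descent stands in for whatever optimizer the authors use. The names `project_to_simplex`, `mse`, and `optimize_weights` are hypothetical, and the numbers are synthetic values made up for illustration, not results from the paper.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running_sum += u_k
        candidate = (running_sum - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # the last k passing this test sets the threshold
    return [max(x - theta, 0.0) for x in v]


def mse(weights, oof_preds, y):
    """Mean squared error of the weighted average of base-learner predictions."""
    total = 0.0
    for row, target in zip(oof_preds, y):
        blended = sum(w * p for w, p in zip(weights, row))
        total += (blended - target) ** 2
    return total / len(y)


def optimize_weights(oof_preds, y, steps=5000):
    """Projected gradient descent on the MSE over the weight simplex.

    oof_preds[i][j] is assumed to be learner j's out-of-fold prediction
    for sample i, i.e. made by a model that did not see sample i in training.
    """
    n, m = len(y), len(oof_preds[0])
    # Conservative step size: 1 / (upper bound on the gradient's Lipschitz constant).
    lr = n / (2.0 * sum(p * p for row in oof_preds for p in row))
    w = [1.0 / m] * m  # start from the simple average
    for _ in range(steps):
        grad = [0.0] * m
        for row, target in zip(oof_preds, y):
            resid = sum(wj * p for wj, p in zip(w, row)) - target
            for j in range(m):
                grad[j] += 2.0 * resid * row[j] / n
        w = project_to_simplex([wj - lr * g for wj, g in zip(w, grad)])
    return w


# Synthetic targets and out-of-fold predictions from three hypothetical learners.
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
oof_preds = [
    [3.8, 2.6, 5.0],
    [4.2, 5.4, 7.0],
    [7.8, 6.6, 9.0],
    [8.2, 9.4, 11.0],
    [11.8, 10.6, 13.0],
    [12.2, 13.4, 15.0],
]

weights = optimize_weights(oof_preds, y)
print("weights:", [round(w, 3) for w in weights])
print("ensemble MSE:", mse(weights, oof_preds, y))
print("simple-average MSE:", mse([1.0 / 3] * 3, oof_preds, y))
```

In this toy data the first two learners have errors of opposite sign and the third has a constant bias, so the optimized weights should land close to 1/3, 2/3, and 0. Fitting the weights on out-of-fold rather than in-sample predictions is the design choice that matters here: it keeps the weights from rewarding base learners that merely overfit the training data.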

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:1908.05287 [stat.ML]
	(or arXiv:1908.05287v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1908.05287

Submission history

From: Mohsen Shahhosseini
[v1] Wed, 14 Aug 2019 18:01:02 UTC (461 KB)
[v2] Tue, 20 Aug 2019 18:10:57 UTC (462 KB)
[v3] Thu, 29 Aug 2019 17:50:15 UTC (462 KB)
[v4] Wed, 11 Sep 2019 22:50:29 UTC (467 KB)
[v5] Sun, 19 Jan 2020 20:26:46 UTC (531 KB)
[v6] Sat, 31 Oct 2020 20:28:35 UTC (548 KB)
