Boosted generalized normal distributions: Integrating machine learning with operations knowledge

Gurlek, Ragip; de Vericourt, Francis; Lee, Donald K. K.

arXiv:2407.19092 (cs)

[Submitted on 26 Jul 2024 (v1), last revised 1 Aug 2024 (this version, v2)]

Abstract: Applications of machine learning (ML) techniques to operational settings often face two challenges: i) ML methods mostly provide point predictions, whereas many operational problems require distributional information; and ii) they typically do not incorporate the extensive body of knowledge in the operations literature, particularly the theoretical and empirical findings that characterize specific distributions. We introduce a novel and rigorous methodology, the Boosted Generalized Normal Distribution ($b$GND), to address these challenges. The Generalized Normal Distribution (GND) encompasses a wide range of parametric distributions commonly encountered in operations, and $b$GND leverages gradient boosting with tree learners to flexibly estimate the parameters of the GND as functions of covariates. We establish $b$GND's statistical consistency, thereby extending this key property to special cases studied in the ML literature that lacked such guarantees. Using data from a large academic emergency department in the United States, we show that the distributional forecasting of patient wait and service times can be meaningfully improved by leveraging findings from the healthcare operations literature. Specifically, $b$GND performs 6% and 9% better than the distribution-agnostic ML benchmark used to forecast wait and service times, respectively. Further analysis suggests that these improvements translate into a 9% increase in patient satisfaction and a 4% reduction in mortality for myocardial infarction patients. Our work underscores the importance of integrating ML with operations knowledge to enhance distributional forecasts.
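
As a rough illustration of the distribution family the abstract refers to, the sketch below evaluates the log-density of a generalized normal distribution in its common textbook parameterization, f(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)^beta). This parameterization, and the names `gnd_logpdf`, `gnd_nll`, `mu`, `alpha`, and `beta`, are assumptions made here for illustration; the paper's own parameterization may differ, and this is not the authors' $b$GND procedure, in which the parameters would be estimated as functions of covariates by gradient boosting.

```python
import math


def gnd_logpdf(x, mu, alpha, beta):
    """Log-density of a generalized normal distribution (assumed textbook form).

    mu is the location, alpha > 0 the scale, beta > 0 the shape.
    beta = 2 gives a normal with standard deviation alpha / sqrt(2);
    beta = 1 gives a Laplace distribution with scale alpha.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    z = abs(x - mu) / alpha
    # log(beta) - log(2 * alpha) - log(Gamma(1 / beta)) - z ** beta
    return math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta) - z ** beta


def gnd_nll(xs, mu, alpha, beta):
    """Negative log-likelihood of observations xs under fixed parameters."""
    return -sum(gnd_logpdf(x, mu, alpha, beta) for x in xs)
```

A negative log-likelihood of this kind is a natural loss for fitting distribution parameters; in a covariate-dependent setting the fixed `mu`, `alpha`, and `beta` would be replaced by per-observation values.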

Comments:	28 pages, 3 figures
Subjects:	Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
MSC classes:	60E05, 62G07, 62F99, 68T01, 90B22, 90B50
Cite as:	arXiv:2407.19092 [cs.LG]
	(or arXiv:2407.19092v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.19092

Submission history

From: Donald Lee
[v1] Fri, 26 Jul 2024 21:18:26 UTC (287 KB)
[v2] Thu, 1 Aug 2024 23:12:50 UTC (287 KB)
