Scalable Probabilistic Forecasting in Retail with Gradient Boosted Trees: A Practitioner's Approach

Long, Xueying; Bui, Quang; Oktavian, Grady; Schmidt, Daniel F.; Bergmeir, Christoph; Godahewa, Rakshitha; Lee, Seong Per; Zhao, Kaifeng; Condylis, Paul

arXiv:2311.00993 (cs)

[Submitted on 2 Nov 2023]

Abstract: The recent M5 competition has advanced the state of the art in retail forecasting. However, we notice important differences between the competition challenge and the challenges we face in a large e-commerce company. The datasets in our scenario are larger (hundreds of thousands of time series), and e-commerce can afford a larger assortment than brick-and-mortar retailers, leading to more intermittent data. We consider two ways to scale to larger datasets with feasible computational effort. First, we investigate a two-layer hierarchy and propose a top-down approach: forecasting at an aggregated level, which has fewer series and less intermittency, and then disaggregating to obtain the decision-level forecasts. Probabilistic forecasts are generated under distributional assumptions. Second, training directly at the lower level on subsamples is an alternative way of scaling; we evaluate the performance of modelling with subsets on the main dataset. In addition to a proprietary dataset, the proposed scalable methods are evaluated on the Favorita dataset and the M5 dataset. We are able to show the differences in characteristics between the e-commerce and brick-and-mortar retail datasets. Notably, our top-down forecasting framework enters the top 50 of the original M5 competition, even with models trained at a higher level under a much simpler setting.
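
The top-down idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the aggregate forecast is a placeholder number rather than the output of a gradient boosted tree model, the disaggregation uses simple historical sales shares, the Poisson distribution stands in for the unspecified distributional assumption, and the names `historical_proportions`, `poisson_quantile`, and `top_down_forecast` are hypothetical.

```python
import math


def historical_proportions(history):
    """Share of total historical sales contributed by each bottom-level series."""
    totals = {name: sum(values) for name, values in history.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No sales observed anywhere: fall back to an equal split.
        return {name: 1.0 / len(history) for name in history}
    return {name: total / grand_total for name, total in totals.items()}


def poisson_quantile(mean, q):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(mean).

    Intended for small means and quantile levels below 1, as in
    intermittent demand; very large means would underflow exp(-mean).
    """
    if mean <= 0:
        return 0
    k = 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mean / k  # recurrence: P(X = k) = P(X = k - 1) * mean / k
        cdf += pmf
    return k


def top_down_forecast(aggregate_forecast, history, quantiles=(0.5, 0.9)):
    """Split an aggregate point forecast and attach Poisson quantiles.

    The Poisson choice is an illustrative assumption, not the paper's.
    """
    shares = historical_proportions(history)
    result = {}
    for name, share in shares.items():
        mean = aggregate_forecast * share
        result[name] = {
            "mean": mean,
            "quantiles": {q: poisson_quantile(mean, q) for q in quantiles},
        }
    return result


# Toy intermittent sales histories for three items in one aggregate group.
history = {
    "item_a": [3, 0, 2, 1],
    "item_b": [0, 0, 1, 0],
    "item_c": [0, 1, 0, 2],
}
# Placeholder aggregate forecast; in practice this would come from a model
# trained at the aggregated level.
forecast = top_down_forecast(aggregate_forecast=5.0, history=history)
```

In this sketch only one model output is needed per aggregate group, and the item-level distributions follow from the shares, which is the source of the computational saving the abstract refers to.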

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2311.00993 [cs.LG]
	(or arXiv:2311.00993v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2311.00993

Submission history

From: Xueying Long
[v1] Thu, 2 Nov 2023 04:46:32 UTC (407 KB)
