PaEBack: Pareto-Efficient Backsubsampling for Time Series Data

Zhang, Xinyu; Ghosh, Sujit

arXiv:2210.15780 (stat)

[Submitted on 27 Oct 2022 (v1), last revised 29 Sep 2023 (this version, v2)]

Abstract: Time series forecasting has long been a central topic in data science, and forecasting models have traditionally relied on extensive historical data. In this paper, we address a practical question: how much recent historical data is required to attain a targeted percentage of statistical prediction efficiency compared to the full time series? We propose the Pareto-Efficient Backsubsampling (PaEBack) method to estimate the percentage of the most recent data needed to achieve the desired level of prediction accuracy. We provide a theoretical justification based on asymptotic prediction theory for AutoRegressive (AR) models. Through several numerical illustrations, we also show the application of PaEBack to some recently developed machine learning forecasting methods, even when the models might be misspecified. The main conclusion is that a fraction of the most recent historical data suffices to provide near-optimal, or even better, relative predictive accuracy for a broad class of forecasting methods.
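
The question posed in the abstract can be illustrated with a minimal sketch. This is not the authors' PaEBack procedure: it assumes a simulated AR(1) series, a least-squares fit without intercept, and an empirical holdout proxy for efficiency, defined here as the full-data one-step-ahead MSE divided by the MSE obtained from the most recent fraction of the training data. All function names and parameters (`simulate_ar1`, `fit_ar1`, `one_step_mse`, `smallest_fraction`, `target`, `n_grid`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def simulate_ar1(n, phi=0.6, sigma=1.0, seed=0):
    """Simulate an AR(1) series x[t] = phi * x[t-1] + noise (illustrative data)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + sigma * rng.standard_normal()
    return x


def fit_ar1(x):
    """Least-squares estimate of the AR(1) coefficient, no intercept."""
    return float(np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1]))


def one_step_mse(phi_hat, test, last_train):
    """Mean squared error of one-step-ahead forecasts over the holdout."""
    prev = np.concatenate(([last_train], test[:-1]))
    return float(np.mean((test - phi_hat * prev) ** 2))


def smallest_fraction(train, test, target=0.95, n_grid=20):
    """Smallest fraction of the most recent training data whose holdout
    efficiency (full-data MSE / subsample MSE) reaches the target.

    Assumption: this empirical ratio stands in for the asymptotic
    prediction efficiency discussed in the paper.
    """
    full_mse = one_step_mse(fit_ar1(train), test, train[-1])
    for frac in np.linspace(0.05, 1.0, n_grid):
        m = max(int(round(frac * len(train))), 3)
        recent = train[-m:]  # back-subsample: keep only the latest m points
        mse = one_step_mse(fit_ar1(recent), test, train[-1])
        efficiency = full_mse / mse
        if efficiency >= target:
            return float(frac), float(efficiency)
    return 1.0, 1.0  # reached only if target exceeds 1


if __name__ == "__main__":
    series = simulate_ar1(1200)
    train, test = series[:1000], series[1000:]
    frac, eff = smallest_fraction(train, test, target=0.95)
    print(f"fraction of recent data: {frac:.2f}, holdout efficiency: {eff:.3f}")
```

Because the efficiency here is measured on a finite holdout, it can exceed 1 for some fractions, which is consistent with the abstract's remark that recent data may give "even better" relative accuracy. A single holdout split is noisy, so in practice one would likely average over several splits or replications.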

Subjects:	Applications (stat.AP); Methodology (stat.ME)
Cite as:	arXiv:2210.15780 [stat.AP]
	(or arXiv:2210.15780v2 [stat.AP] for this version)
	https://doi.org/10.48550/arXiv.2210.15780

Submission history

From: Xinyu Zhang
[v1] Thu, 27 Oct 2022 21:44:25 UTC (2,128 KB)
[v2] Fri, 29 Sep 2023 08:24:31 UTC (2,086 KB)
