Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization

Masserano, Luca; Ansari, Abdul Fatir; Han, Boran; Zhang, Xiyuan; Faloutsos, Christos; Mahoney, Michael W.; Wilson, Andrew Gordon; Park, Youngsuk; Rangapuram, Syama; Maddix, Danielle C.; Wang, Yuyang

arXiv:2412.05244 (cs)

[Submitted on 6 Dec 2024]

Abstract: How to best develop foundation models for time series forecasting remains an important open question. Tokenization is a crucial consideration in this effort: what is an effective discrete vocabulary for a real-valued sequential input? To address this question, we develop WaveToken, a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies. Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pre-trains an autoregressive model to forecast coefficients for the forecast horizon. By decomposing coarse and fine structures in the inputs, wavelets provide an eloquent and compact language for time series forecasting that simplifies learning. Empirical results on a comprehensive benchmark, including 42 datasets for both in-domain and zero-shot settings, show that WaveToken: i) provides better accuracy than recently proposed foundation models for forecasting while using a much smaller vocabulary (1024 tokens), and performs on par with or better than modern deep learning models trained specifically on each dataset; and ii) exhibits superior generalization capabilities, achieving the best average rank across all datasets for three complementary metrics. In addition, we show that our method can easily capture complex temporal patterns of practical relevance that are challenging for other recent pre-trained models, including trends, sparse spikes, and non-stationary time series with varying frequencies evolving over time.
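The tokenization steps named in the abstract (scale, decompose, threshold, quantize) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single-level Haar wavelet, mean-absolute scaling, hard thresholding, uniform bins, and the hypothetical names `tokenize`, `detokenize`, `threshold`, and `clip`. Only the vocabulary size of 1024 comes from the abstract. The autoregressive model that forecasts the tokens is out of scope.

```python
import math


def haar_dwt(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(x) % 2:
        raise ValueError("length must be even")
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def tokenize(series, vocab_size=1024, threshold=0.05, clip=8.0):
    """Scale, decompose, threshold, and quantize a series into integer tokens."""
    # Assumed scaling: divide by the mean absolute value (1.0 if all zeros).
    scale = sum(abs(v) for v in series) / len(series) or 1.0
    scaled = [v / scale for v in series]
    approx, detail = haar_dwt(scaled)
    # Assumed hard threshold: zero out small detail coefficients.
    detail = [d if abs(d) >= threshold else 0.0 for d in detail]
    # Assumed uniform quantization of [-clip, clip] into vocab_size bins.
    width = 2.0 * clip / vocab_size
    tokens = []
    for c in approx + detail:
        idx = math.floor((c + clip) / width)
        tokens.append(min(vocab_size - 1, max(0, idx)))
    return tokens, scale


def detokenize(tokens, scale, vocab_size=1024, clip=8.0):
    """Map tokens back to bin centers, invert the transform, and rescale."""
    width = 2.0 * clip / vocab_size
    coeffs = [-clip + (t + 0.5) * width for t in tokens]
    half = len(coeffs) // 2
    scaled = haar_idwt(coeffs[:half], coeffs[half:])
    return [v * scale for v in scaled]


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]
    tokens, scale = tokenize(series)
    print(tokens)
    print(detokenize(tokens, scale))
```

In this sketch the first half of the token sequence encodes the coarse (approximation) coefficients and the second half the fine (detail) coefficients, so a sequence model would see the coarse structure before the fine structure. A multi-level decomposition would extend the same idea by applying `haar_dwt` recursively to the approximation part.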

Comments:	25 pages, 15 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2412.05244 [cs.LG]
	(or arXiv:2412.05244v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.05244

Submission history

From: Luca Masserano
[v1] Fri, 6 Dec 2024 18:22:59 UTC (21,381 KB)
