Shifts 2.0: Extending The Dataset of Real Distributional Shifts

Malinin, Andrey; Athanasopoulos, Andreas; Barakovic, Muhamed; Cuadra, Meritxell Bach; Gales, Mark J. F.; Granziera, Cristina; Graziani, Mara; Kartashev, Nikolay; Kyriakopoulos, Konstantinos; Lu, Po-Jui; Molchanova, Nataliia; Nikitakis, Antonis; Raina, Vatsal; La Rosa, Francesco; Sivena, Eli; Tsarsitalidis, Vasileios; Tsompopoulou, Efi; Volf, Elena

arXiv:2206.15407 (cs)

[Submitted on 30 Jun 2022 (v1), last revised 15 Sep 2022 (this version, v2)]

Abstract: Distributional shift, or the mismatch between training and deployment data, is a significant obstacle to the usage of machine learning in high-stakes industrial applications, such as autonomous driving and medicine. This creates a need to be able to assess how robustly ML models generalize as well as the quality of their uncertainty estimates. Standard ML baseline datasets do not allow these properties to be assessed, as the training, validation and test data are often identically distributed. Recently, a range of dedicated benchmarks have appeared, featuring both distributionally matched and shifted data. Among these benchmarks, the Shifts dataset stands out in terms of the diversity of tasks as well as the data modalities it features. While most of the benchmarks are heavily dominated by 2D image classification tasks, Shifts contains tabular weather forecasting, machine translation, and vehicle motion prediction tasks. This enables the robustness properties of models to be assessed on a diverse set of industrial-scale tasks and either universal or directly applicable task-specific conclusions to be reached. In this paper, we extend the Shifts Dataset with two datasets sourced from industrial, high-risk applications of high societal importance. Specifically, we consider the tasks of segmentation of white matter Multiple Sclerosis lesions in 3D magnetic resonance brain images and the estimation of power consumption in marine cargo vessels. Both tasks feature ubiquitous distributional shifts and a strict safety requirement due to the high cost of errors. These new datasets will allow researchers to further explore robust generalization and uncertainty estimation in new situations. In this work, we provide a description of the dataset and baseline results for both tasks.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2206.15407 [cs.LG]
	(or arXiv:2206.15407v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.15407

Submission history

From: Andrey Malinin Dr.
[v1] Thu, 30 Jun 2022 16:51:52 UTC (4,126 KB)
[v2] Thu, 15 Sep 2022 09:52:12 UTC (4,352 KB)