Imputation of missing values in multi-view data

van Loon, Wouter; Fokkema, Marjolein; de Vos, Frank; Koini, Marisa; Schmidt, Reinhold; de Rooij, Mark

doi:10.1016/j.inffus.2024.102524

arXiv:2210.14484 (stat)

[Submitted on 26 Oct 2022 (v1), last revised 20 Jun 2024 (this version, v4)]

Abstract: Data for which a set of objects is described by multiple distinct feature sets (called views) is known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This may lead to very large quantities of missing data which, especially when combined with high dimensionality, can make the application of conditional imputation methods computationally infeasible. However, the multi-view structure could be leveraged to reduce the complexity and computational load of imputation. We introduce a new imputation method based on the existing stacked penalized logistic regression (StaPLR) algorithm for multi-view learning. It performs imputation in a dimension-reduced space to address computational challenges inherent to the multi-view context. We compare the performance of the new imputation method with several existing imputation algorithms in simulated data sets and a real data application. The results show that the new imputation method leads to competitive results at a much lower computational cost, and makes the use of advanced imputation algorithms such as missForest and predictive mean matching possible in settings where they would otherwise be computationally infeasible.
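The idea of imputing in a dimension-reduced space can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. Each view is reduced to a single score per object by a stand-in base learner (a projection onto the difference of class means, rather than penalized logistic regression), cross-validation is omitted, and missing scores are filled by simple mean imputation in place of a more advanced algorithm. The names `fit_view_scorer` and `reduce_and_impute` are hypothetical.

```python
import math
import random


def fit_view_scorer(rows, labels):
    """Stand-in base learner (an assumption, not StaPLR's penalized logistic
    regression): score a row by projecting it onto the difference of the two
    class means, then squash the result to (0, 1)."""
    p = len(rows[0])
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    mu1 = [sum(r[j] for r in pos) / len(pos) for j in range(p)]
    mu0 = [sum(r[j] for r in neg) / len(neg) for j in range(p)]
    w = [a - b for a, b in zip(mu1, mu0)]
    bias = -sum(wj * (a + b) / 2 for wj, a, b in zip(w, mu1, mu0))

    def score(row):
        t = sum(wj * xj for wj, xj in zip(w, row)) + bias
        t = max(-30.0, min(30.0, t))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-t))

    return score


def reduce_and_impute(views, labels):
    """views[v][i] is the feature list of object i in view v, or None when
    the whole view is missing for that object. Returns an n x V matrix of
    view-level scores with missing entries mean-imputed."""
    n = len(labels)
    columns = []
    for view in views:
        observed = [i for i in range(n) if view[i] is not None]
        score = fit_view_scorer([view[i] for i in observed],
                                [labels[i] for i in observed])
        col = [score(view[i]) if view[i] is not None else None
               for i in range(n)]
        mean = sum(z for z in col if z is not None) / len(observed)
        # Imputation happens here: one value per missing view, not p values.
        columns.append([mean if z is None else z for z in col])
    return [list(row) for row in zip(*columns)]


# Simulated example: 40 objects, 3 views of 50 features each, and 10 objects
# per view for which the entire view is missing.
random.seed(0)
n, p, V = 40, 50, 3
labels = [i % 2 for i in range(n)]
views = []
for _ in range(V):
    view = [[random.gauss(0.5 * y, 1.0) for _ in range(p)] for y in labels]
    for i in random.sample(range(n), 10):
        view[i] = None
    views.append(view)

Z = reduce_and_impute(views, labels)
cells_feature_space = sum(p for view in views for row in view if row is None)
cells_reduced_space = sum(1 for view in views for row in view if row is None)
print(len(Z), len(Z[0]), cells_feature_space, cells_reduced_space)
# prints: 40 3 1500 30
```

In this toy setting, imputing in the original feature space would require filling 1500 cells, while the reduced space has only 30 missing values. That reduction is what would make a more expensive imputation algorithm affordable at this stage.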

Comments:	49 pages, 15 figures. Accepted manuscript
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:2210.14484 [stat.ML]
	(or arXiv:2210.14484v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2210.14484
Journal reference:	Information Fusion 111 (2024) 102524
Related DOI:	https://doi.org/10.1016/j.inffus.2024.102524

Submission history

From: Wouter van Loon
[v1] Wed, 26 Oct 2022 05:19:30 UTC (612 KB)
[v2] Tue, 25 Apr 2023 13:04:21 UTC (613 KB)
[v3] Thu, 29 Feb 2024 16:21:33 UTC (741 KB)
[v4] Thu, 20 Jun 2024 12:18:33 UTC (747 KB)
