Predicting sub-population specific viral evolution

Shi, Wenxian; Wu, Menghua; Barzilay, Regina

arXiv:2410.21518 (cs)

COVID-19 e-print

Important: e-prints posted on arXiv are not peer-reviewed by arXiv; they should not be relied upon without context to guide clinical practice or health-related behavior and should not be reported in news media as established information without consulting multiple experts in the field.

[Submitted on 28 Oct 2024 (v1), last revised 23 Apr 2025 (this version, v2)]

Abstract: Forecasting the change in the distribution of viral variants is crucial for therapeutic design and disease surveillance. This task poses significant modeling challenges due to the sharp differences in virus distributions across sub-populations (e.g., countries) and their dynamic interactions. Existing machine learning approaches that model the variant distribution as a whole are incapable of making location-specific predictions and ignore transmissions that shape the viral landscape. In this paper, we propose a sub-population specific protein evolution model, which predicts the time-resolved distributions of viral proteins in different locations. The algorithm explicitly models the transmission rates between sub-populations and learns their interdependence from data. The change in protein distributions across all sub-populations is defined through a linear ordinary differential equation (ODE) parametrized by transmission rates. Solving this ODE yields the likelihood of a given protein occurring in particular sub-populations. Multi-year evaluation on both SARS-CoV-2 and influenza A/H3N2 demonstrates that our model outperforms baselines in accurately predicting distributions of viral proteins across continents and countries. We also find that the transmission rates learned from data are consistent with the transmission pathways discovered by retrospective phylogenetic analysis.
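
Illustration (not from the paper): the abstract states that the protein distributions follow a linear ODE parametrized by transmission rates. The sketch below shows what solving a generic linear ODE of the form dn/dt = A n can look like. The matrix A, its values, the three sub-populations, and the function names are all hypothetical; in particular, the assumption that A only moves mass between sub-populations (columns summing to zero, no local growth) is a simplification chosen for the example and may differ from the authors' parametrization, which is learned from data.

```python
import math


def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def solve_linear_ode(A, n0, t, terms=20):
    """Return n(t) = exp(A t) n0, the solution of dn/dt = A n with n(0) = n0.

    The interval [0, t] is split into sub-steps so that a truncated
    Taylor series of the matrix exponential converges quickly on each one.
    """
    norm = max(sum(abs(a) for a in row) for row in A) * abs(t)
    steps = max(1, math.ceil(norm))
    h = t / steps
    n = list(n0)
    for _ in range(steps):
        term = list(n)   # current Taylor term (A h)^k n / k!
        total = list(n)  # running sum of the series
        for k in range(1, terms + 1):
            term = [h * x / k for x in mat_vec(A, term)]
            total = [s + x for s, x in zip(total, term)]
        n = total
    return n


# Hypothetical rates: A[i][j] (i != j) is the flow from sub-population j to i,
# and each diagonal entry is minus the total outflow, so columns sum to zero.
A = [
    [-0.30, 0.10, 0.00],
    [0.20, -0.10, 0.05],
    [0.10, 0.00, -0.05],
]

# Assumed starting point: the protein is observed only in sub-population 0.
n0 = [1.0, 0.0, 0.0]

n_t = solve_linear_ode(A, n0, t=5.0)
shares = [x / sum(n_t) for x in n_t]  # relative occurrence per sub-population
```

With these made-up rates the total mass stays constant over time while it spreads from sub-population 0 to the other two; `shares` then gives the relative occurrence in each location at time t.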

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2410.21518 [cs.LG]
	(or arXiv:2410.21518v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.21518

Submission history

From: Wenxian Shi
[v1] Mon, 28 Oct 2024 20:39:37 UTC (2,300 KB)
[v2] Wed, 23 Apr 2025 15:32:41 UTC (1,974 KB)
