Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee

Chen, George H.

arXiv:2206.10477 (cs)

[Submitted on 21 Jun 2022 (v1), last revised 16 Feb 2025 (this version, v6)]

Abstract: Kernel survival analysis models estimate individual survival distributions with the help of a kernel function, which measures the similarity between any two data points. Such a kernel function can be learned using deep kernel survival models. In this paper, we present a new deep kernel survival model called a survival kernet, which scales to large datasets in a manner that is amenable to model interpretation and also theoretical analysis. Specifically, the training data are partitioned into clusters based on a recently developed training set compression scheme for classification and regression called kernel netting that we extend to the survival analysis setting. At test time, each data point is represented as a weighted combination of these clusters, and each such cluster can be visualized. For a special case of survival kernets, we establish a finite-sample error bound on predicted survival distributions that is, up to a log factor, optimal. Whereas scalability at test time is achieved using the aforementioned kernel netting compression strategy, scalability during training is achieved by a warm-start procedure based on tree ensembles such as XGBoost and a heuristic approach to accelerating neural architecture search. On four standard survival analysis datasets of varying sizes (up to roughly 3 million data points), we show that survival kernets are highly competitive compared to various baselines tested in terms of time-dependent concordance index. Our code is available at: this https URL
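
The first sentence of the abstract, on estimating an individual survival distribution from kernel similarities, can be illustrated with a kernel-weighted Kaplan-Meier estimator. The sketch below is not the survival kernet model: it assumes a fixed Gaussian kernel on raw features rather than a learned deep kernel, it uses every training point rather than compressed clusters, and the function names and the `bandwidth` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(x, X, bandwidth=1.0):
    """Similarity between one test point x and every row of X.

    Assumption: a fixed Gaussian kernel stands in for a learned kernel.
    """
    sq_dists = np.sum((X - x) ** 2, axis=1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_kaplan_meier(x, X_train, times, events, bandwidth=1.0):
    """Kernel-weighted Kaplan-Meier estimate of S(t | x).

    times  : observed times (event or censoring), shape (n,)
    events : 1 if the event was observed, 0 if censored, shape (n,)
    Returns the sorted unique event times and the estimated survival
    probability just after each of them.
    """
    weights = gaussian_kernel(x, X_train, bandwidth)
    event_times = np.unique(times[events == 1])
    survival = np.empty(len(event_times))
    current = 1.0
    for j, t in enumerate(event_times):
        # Total kernel weight of training points still at risk at time t.
        at_risk = weights[times >= t].sum()
        # Total kernel weight of training points whose event occurs at t.
        died = weights[(times == t) & (events == 1)].sum()
        if at_risk > 0:
            current *= 1.0 - died / at_risk
        survival[j] = current
    return event_times, survival


# Toy data: identical features give equal weights, so the estimate
# reduces to the ordinary Kaplan-Meier curve.
X_toy = np.zeros((4, 2))
t_toy = np.array([1.0, 2.0, 3.0, 4.0])
e_toy = np.array([1, 0, 1, 1])
toy_times, toy_surv = kernel_kaplan_meier(np.zeros(2), X_toy, t_toy, e_toy)
```

On the toy data the three event times are 1, 3, and 4, and the estimated survival probabilities are 0.75, 0.375, and 0. With distinct features, training points closer to the test point contribute more weight, which is what makes the estimate individual-specific.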

Comments:	Journal of Machine Learning Research (JMLR 2024); this draft includes minor corrections over the original JMLR draft (main change: previously, the TUNA warm-start in Section 4 was inaccurately stated to include an extra step of running a method by Chen (2020), but the final version of our code did not use this extra step; we updated Sections 4 and 5.2 and Appendices B and C.3 to match our code)
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2206.10477 [cs.LG]
	(or arXiv:2206.10477v6 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.10477

Submission history

From: George Chen
[v1] Tue, 21 Jun 2022 15:42:48 UTC (3,370 KB)
[v2] Mon, 27 Jun 2022 20:12:19 UTC (3,370 KB)
[v3] Thu, 30 Jun 2022 20:52:46 UTC (3,500 KB)
[v4] Sun, 9 Jul 2023 05:37:52 UTC (959 KB)
[v5] Mon, 19 Feb 2024 23:36:40 UTC (973 KB)
[v6] Sun, 16 Feb 2025 00:24:54 UTC (972 KB)
