Transactional Python for Durable Machine Learning: Vision, Challenges, and Feasibility

Chockchowwat, Supawit; Li, Zhaoheng; Park, Yongjoo

doi:10.1145/3595360.3595855

arXiv:2305.08770 (cs)

[Submitted on 15 May 2023]

Abstract: In machine learning (ML), Python serves as a convenient abstraction for working with key libraries such as PyTorch, scikit-learn, and others. Unlike a DBMS, however, a Python application may lose important data, such as trained models and extracted features, to machine failures or human errors, wasting time and resources. Specifically, Python applications lack four essential properties that could make ML more reliable and user-friendly: durability, atomicity, replicability, and time-versioning (DART).
This paper presents our vision of Transactional Python, which provides DART without any code modifications to user programs or the Python kernel by non-intrusively monitoring application states at the object level and determining a minimal amount of information sufficient to reconstruct a whole application. Our evaluation of a proof-of-concept implementation with public PyTorch and scikit-learn applications shows that DART can be offered with overheads ranging from 1.5% to 15.6%.
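
To make the idea of object-level monitoring concrete, here is a minimal sketch. It is not the paper's implementation. It assumes that state lives in a plain dictionary of picklable variables, that a change can be detected by hashing each variable's pickled bytes, and that a version is a file holding only the variables that changed. The names `fingerprint` and `DeltaCheckpointer` are hypothetical. The sketch touches durability, atomicity, and time-versioning only. It does not address replicability or deleted variables.

```python
import hashlib
import os
import pickle
import tempfile


def fingerprint(obj):
    """Hash an object's pickled bytes (an assumed, simplistic change detector)."""
    return hashlib.sha256(pickle.dumps(obj)).hexdigest()


class DeltaCheckpointer:
    """Hypothetical helper: persist only variables that changed since the last commit."""

    def __init__(self, directory):
        self.directory = directory
        self.seen = {}    # variable name -> fingerprint at the last commit
        self.version = 0  # number of commits written so far

    def commit(self, namespace):
        """Write the changed variables as a new version; return their names, sorted."""
        delta, digests = {}, {}
        for name, value in namespace.items():
            digest = fingerprint(value)
            if self.seen.get(name) != digest:
                delta[name] = value
                digests[name] = digest
        # Write to a temporary file, force it to disk, then rename it into place.
        # A crash before the rename leaves all earlier versions untouched.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(delta, f)
            f.flush()
            os.fsync(f.fileno())
        self.version += 1
        os.replace(tmp_path, os.path.join(self.directory, f"v{self.version}.pkl"))
        self.seen.update(digests)
        return sorted(delta)

    def restore(self, version):
        """Rebuild the namespace as of `version` by replaying deltas 1..version."""
        state = {}
        for v in range(1, version + 1):
            with open(os.path.join(self.directory, f"v{v}.pkl"), "rb") as f:
                state.update(pickle.load(f))
        return state
```

In this sketch, a second `commit` after changing one variable writes only that variable, and `restore(1)` still returns the earlier state. A real system would need a cheaper change detector than re-pickling every object, plus handling for objects that cannot be pickled.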

Comments:	5 pages, 5 figures, to appear at DEEM 2023
Subjects:	Databases (cs.DB); Machine Learning (cs.LG); Programming Languages (cs.PL)
Cite as:	arXiv:2305.08770 [cs.DB]
	(or arXiv:2305.08770v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2305.08770
Related DOI:	https://doi.org/10.1145/3595360.3595855

Submission history

From: Supawit Chockchowwat
[v1] Mon, 15 May 2023 16:27:09 UTC (1,049 KB)
