Reducing Coverage Depth in DNA Storage: A Combinatorial Perspective on Random Access Efficiency

Gruica, Anina; Bar-Lev, Daniella; Ravagnani, Alberto; Yaakobi, Eitan

arXiv:2401.15722v1 (cs)

[Submitted on 28 Jan 2024 (this version), latest version 30 Sep 2024 (v2)]

Abstract: We investigate the fundamental limits of the recently proposed random access coverage depth problem for DNA data storage. Under this paradigm, it is assumed that the user information consists of $k$ information strands, which are encoded into $n$ strands via some generator matrix $G$. In the sequencing process, the strands are read uniformly at random, since each strand is available in a large number of copies. In this context, the random access coverage depth problem concerns the expected number of reads (i.e., sequenced strands) until it is possible to decode a specific information strand requested by the user. The goal is to minimize the maximum of this expectation over all possible requested information strands; this value is denoted by $T_{\max}(G)$. This paper introduces new techniques to investigate the random access coverage depth problem, which capture its combinatorial nature. We establish two general formulas to find $T_{\max}(G)$ for arbitrary matrices. We introduce the concept of recovery balanced codes and combine all these results and notions to compute $T_{\max}(G)$ for MDS, simplex, and Hamming codes. We also study the performance of modified systematic MDS matrices, and our results show that the lowest values of $T_{\max}(G)$ are achieved with a specific mix of encoded strands and replication of the information strands.
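
To make the quantity $T_{\max}(G)$ concrete, here is a minimal brute-force sketch for very small matrices. It is not one of the paper's two general formulas; it only follows the model described in the abstract (reads drawn uniformly at random, with replacement, from the $n$ encoded strands). Two assumptions are ours: the field is restricted to GF(2), with each column of $G$ stored as an integer bitmask, and the names `in_span`, `expected_reads` and `t_max` are hypothetical. The sketch uses the fact that the first $s$ distinct strands read form a uniformly random $s$-subset of the columns, and that moving from $s$ to $s+1$ distinct strands takes $n/(n-s)$ reads on average.

```python
from fractions import Fraction
from itertools import combinations


def in_span(vectors, target):
    """Return True if `target` lies in the GF(2) span of `vectors` (bitmask ints)."""
    basis = []  # XOR basis; every element has a distinct leading bit
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit in v if it is set
        if v:
            basis.append(v)
    for b in basis:
        target = min(target, target ^ b)
    return target == 0


def expected_reads(columns, i):
    """Exact expected number of reads until information strand i is decodable.

    `columns` holds the n columns of G as bitmasks; strand i is the unit
    vector with bit i set. Reads are uniform over the n columns, with
    replacement (assumption taken from the abstract's model).
    """
    n = len(columns)
    target = 1 << i
    if not in_span(columns, target):
        raise ValueError("strand i cannot be recovered from the columns of G")
    total = Fraction(0)
    for s in range(n):
        subsets = list(combinations(columns, s))
        # Fraction of s-subsets of columns that do NOT yet recover strand i.
        failing = sum(1 for sub in subsets if not in_span(sub, target))
        # Average wait for the next new column when s distinct ones are known.
        total += Fraction(failing, len(subsets)) * Fraction(n, n - s)
    return total


def t_max(columns, k):
    """Maximum expected number of reads over all k information strands."""
    return max(expected_reads(columns, i) for i in range(k))


if __name__ == "__main__":
    identity = [0b001, 0b010, 0b100]  # k = n = 3, no coding
    parity = [0b01, 0b10, 0b11]       # k = 2, n = 3, single parity strand
    print(t_max(identity, 3))         # 3
    print(t_max(parity, 2))           # 2
```

The enumeration is exponential in $n$, so it is only useful as a sanity check on toy examples, such as confirming that an uncoded scheme with $k$ strands needs $k$ reads in expectation.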

Subjects:	Information Theory (cs.IT); Combinatorics (math.CO)
Cite as:	arXiv:2401.15722 [cs.IT]
	(or arXiv:2401.15722v1 [cs.IT] for this version)
	https://doi.org/10.48550/arXiv.2401.15722

Submission history

From: Anina Gruica
[v1] Sun, 28 Jan 2024 18:13:19 UTC (783 KB)
[v2] Mon, 30 Sep 2024 11:27:46 UTC (53 KB)
