Learning quantitative sequence-function relationships from high-throughput biological data

Atwal, Gurinder S.; Kinney, Justin B.

arXiv:1506.00054v1 (q-bio)

[Submitted on 30 May 2015 (this version), latest version 22 Sep 2015 (v2)]

Abstract: Understanding the transcriptional regulatory code, as well as other types of information encoded within biomolecular sequences, will require learning biophysical models of sequence-function relationships from high-throughput data. Controlling and characterizing the noise in such experiments, however, is notoriously difficult. The unpredictability of such noise creates problems for standard likelihood-based methods in statistical learning, which require that the quantitative form of experimental noise be known precisely. However, when this unpredictability is properly accounted for, important theoretical aspects of statistical learning that remain hidden in standard treatments are revealed. Specifically, one finds a close relationship between the standard inference method, based on likelihood, and an alternative inference method based on mutual information. Here we review and extend this relationship. We also describe its implications for learning sequence-function relationships from real biological data. Finally, we detail an idealized experiment in which these results can be demonstrated analytically.

Comments:	21 pages, 7 figures. Submitted for publication
Subjects:	Quantitative Methods (q-bio.QM); Statistics Theory (math.ST); Biological Physics (physics.bio-ph); Data Analysis, Statistics and Probability (physics.data-an); Machine Learning (stat.ML)
Cite as:	arXiv:1506.00054 [q-bio.QM]
	(or arXiv:1506.00054v1 [q-bio.QM] for this version)
	https://doi.org/10.48550/arXiv.1506.00054

Submission history

From: Justin Kinney
[v1] Sat, 30 May 2015 01:20:59 UTC (3,295 KB)
[v2] Tue, 22 Sep 2015 15:47:20 UTC (2,300 KB)
