Quantitative Biology > Genomics
[Submitted on 19 Feb 2022 (v1), last revised 24 Dec 2022 (this version, v2)]
Title: Identifying OCRs in cfDNA WGS Data by Correlation Clustering
Abstract: Over the past decade, the emergence of liquid biopsy has significantly improved cancer monitoring and detection. Dying cells, including those originating from tumors, shed their DNA into the bloodstream and contribute to a pool of circulating fragments called cell-free DNA (cfDNA). Identifying the tissue of origin of these DNA fragments from their epigenetic features has implications in various clinical contexts. Open chromatin regions (OCRs) are important epigenetic features of DNA that reflect the cell types of origin. Profiling these features by DNase-seq, ATAC-seq, and histone ChIP-seq provides insights into tissue-specific and disease-specific regulatory mechanisms. Integration of genomic and epigenomic features for cancer detection by liquid biopsy has previously been reported. However, many multimodal analyses require large amounts of cfDNA input and/or multiple types of experiments to cover the genomic and epigenomic aspects of a single sample, which is prohibitive in cost and time. Methods that capture genomic and epigenomic profiles in a single experiment type with low input requirements are therefore important. Predicting OCRs from whole genome sequencing (WGS) data is one such approach. Here, we applied a correlation clustering algorithm to predict OCRs, using local sequencing depth as input. The following processing steps were then applied: count normalization, discrete Fourier transform conversion, graph construction, graph cut optimization by linear programming, and clustering. To validate the proposed method, we compared our predictions (OCR vs. non-OCR) with previously validated open chromatin regions from human blood samples in ATAC-db. The overlap between the two is greater than 67%.
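The first three processing steps named in the abstract (count normalization, discrete Fourier transform conversion, and graph construction) can be illustrated with a minimal sketch. This is not the authors' implementation: the mean-scaling normalization, the fixed non-overlapping windows, the use of DFT magnitudes as features, the Pearson correlation as edge weight, and all function names are assumptions made for illustration. The linear-programming graph cut and the final clustering are not shown.

```python
import cmath
import math


def normalize_counts(depth):
    """Scale depth so its mean is 1 (assumed stand-in for count normalization)."""
    mean = sum(depth) / len(depth)
    if mean == 0:
        return [0.0 for _ in depth]
    return [d / mean for d in depth]


def dft_magnitudes(window):
    """Magnitudes of the discrete Fourier transform for bins 0..n//2."""
    n = len(window)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(window)
        )
        mags.append(abs(coeff))
    return mags


def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def signed_graph(features):
    """Complete graph over windows; each edge holds (correlation, sign).

    Sign +1 marks a 'similar' pair and -1 a 'dissimilar' pair, which is the
    kind of input a correlation clustering step works on.
    """
    edges = {}
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            r = pearson(features[i], features[j])
            edges[(i, j)] = (r, 1 if r >= 0 else -1)
    return edges


# Toy depth track (hypothetical values): the middle window has lower coverage.
depth = [30, 32, 31, 29, 10, 8, 9, 11, 30, 31, 33, 28]
window_size = 4  # assumed window length

norm = normalize_counts(depth)
windows = [norm[i:i + window_size] for i in range(0, len(norm), window_size)]
features = [dft_magnitudes(w) for w in windows]
graph = signed_graph(features)

for (i, j), (r, sign) in sorted(graph.items()):
    print(f"windows {i}-{j}: correlation={r:.3f}, sign={sign:+d}")
```

In this sketch every pair of windows gets an edge, so the graph grows quadratically with the number of windows; a genome-scale version would presumably restrict edges to nearby windows or use a vectorized library, but the abstract does not say how the authors handle this.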
Submission history
From: Fahimeh Palizban
[v1] Sat, 19 Feb 2022 14:54:18 UTC (536 KB)
[v2] Sat, 24 Dec 2022 21:17:48 UTC (883 KB)