Quantitative Biology > Genomics
[Submitted on 9 Dec 2021]
Title: Classification of genetic variants using machine learning
Abstract: Recent advances in genomic sequencing technology have produced an abundance of genome sequence data. Despite progress in interpreting these data, much of their potential for clinical and societal benefit remains unrealized. Loss-of-function (LOF) variants in the human genome can be causal in disease development. Precise identification of such variants, together with prediction of their pathogenicity, may lead to better drug targeting, among other benefits. Machine learning is a promising approach to this task because of its proven predictive ability. We curated a novel dataset for the classification of LOF variants from high-quality databases of genetic variation. Using this dataset, we trained and validated seven classification algorithms to classify variants as Benign, Pathogenic, or Likely pathogenic. The best overall performance came from the XGBoost algorithm, with an F1-score of 0.88 on the test set. Performance was fair on the Pathogenic class, with high recall and moderate precision, and subpar on the Likely pathogenic class, albeit with moderate precision. Overall, these encouraging results make our final model a promising candidate for further real-world testing.
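The abstract reports results as an overall F1-score plus per-class precision and recall. The sketch below shows one way such figures could be computed for the three classes. It is a minimal illustration and not the authors' evaluation code. The helper names (`per_class_metrics`, `weighted_f1`) are hypothetical, and the toy labels are invented for illustration and are not data from the paper. The abstract does not say how the overall F1-score was averaged across classes, so the support-weighted average used here is an assumption.

```python
from collections import Counter

# Class names as given in the abstract.
CLASSES = ["Benign", "Pathogenic", "Likely pathogenic"]


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return {class: (precision, recall, f1)} for a multi-class problem."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    metrics = {}
    for c in classes:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics


def weighted_f1(y_true, y_pred, classes=CLASSES):
    """Support-weighted mean of per-class F1 (an assumed averaging scheme)."""
    support = Counter(y_true)
    metrics = per_class_metrics(y_true, y_pred, classes)
    return sum(metrics[c][2] * support[c] for c in classes) / len(y_true)


# Toy labels, invented for illustration only (not data from the paper).
y_true = ["Benign"] * 4 + ["Pathogenic"] * 3 + ["Likely pathogenic"] * 3
y_pred = [
    "Benign", "Benign", "Benign", "Pathogenic",
    "Pathogenic", "Pathogenic", "Pathogenic",
    "Pathogenic", "Likely pathogenic", "Benign",
]

if __name__ == "__main__":
    for name, (p, r, f) in per_class_metrics(y_true, y_pred).items():
        print(f"{name:18s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
    print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
```

In the toy labels, every Pathogenic sample is recovered, but two samples from other classes are also labelled Pathogenic. This gives the high-recall, moderate-precision pattern that the abstract describes for that class.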