Condensed Matter > Materials Science
[Submitted on 3 Aug 2023 (v1), last revised 11 Dec 2023 (this version, v2)]
Title: Accelerated Organic Crystal Structure Prediction with Genetic Algorithms and Machine Learning
Abstract: We present a high-throughput, end-to-end pipeline for organic crystal structure prediction (CSP) -- the problem of identifying the stable crystal structures that will form from a given molecule based only on its molecular composition. Our tool uses Neural Network Potentials (NNPs) to allow for efficient screening and structural relaxation of generated crystal candidates. Our pipeline consists of two distinct stages: random search, in which crystal candidates are randomly generated and screened, and optimization, in which a genetic algorithm (GA) optimizes this screened population. We assess the performance of each stage of our pipeline on 21 molecules taken from the Cambridge Crystallographic Data Centre's CSP blind tests. We show that random search alone yields matches for $\approx 50\%$ of targets. We then validate the potential of our full pipeline, using the GA to optimize the root-mean-square deviation (RMSD) between crystal candidates and the experimentally derived structure. With this approach, we are able to find matches for $\approx 80\%$ of candidates with initial population sizes 10-100 times smaller than when using random search. Lastly, we run our full pipeline with an ANI model that is trained on a small dataset of molecules extracted from crystal structures in the Cambridge Structural Database, generating $\approx 60\%$ of targets. By leveraging ML models trained to predict energies at the DFT level, our pipeline has the potential to approach the accuracy of \emph{ab initio} methods and the efficiency of empirical force fields.
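The two-stage structure described in the abstract (random generation and screening, then GA optimization of the screened population) can be illustrated with a minimal toy sketch. This is not the authors' implementation: candidates here are plain vectors in the unit cube rather than crystal structures, `toy_score` is a hypothetical stand-in for an NNP energy or an RMSD to a target, and every function name, parameter, and default value below is an assumption made for illustration only.

```python
import random


def toy_score(x):
    """Hypothetical stand-in for an NNP energy or RMSD; lower is better."""
    return sum((xi - 0.5) ** 2 for xi in x)


def random_candidate(rng, dim):
    """Draw one candidate uniformly from the unit cube (toy 'crystal')."""
    return [rng.random() for _ in range(dim)]


def random_search(rng, n_samples, n_keep, dim):
    """Stage 1: generate random candidates and keep the n_keep best."""
    pool = [random_candidate(rng, dim) for _ in range(n_samples)]
    return sorted(pool, key=toy_score)[:n_keep]


def crossover(rng, a, b):
    """Uniform crossover: each coordinate is taken from either parent."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def mutate(rng, x, sigma=0.05):
    """Gaussian perturbation, clamped back into the unit cube."""
    return [min(1.0, max(0.0, xi + rng.gauss(0.0, sigma))) for xi in x]


def genetic_algorithm(rng, population, n_generations):
    """Stage 2: evolve the screened population with elitist selection."""
    size = len(population)
    for _ in range(n_generations):
        children = []
        for _ in range(size):
            a, b = rng.sample(population, 2)
            children.append(mutate(rng, crossover(rng, a, b)))
        # Elitism: parents compete with children, so the best score
        # in the population can never get worse between generations.
        population = sorted(population + children, key=toy_score)[:size]
    return population


def run_pipeline(seed=0, dim=6, n_samples=200, n_keep=20, n_generations=30):
    """Run random search, then hand the survivors to the GA."""
    rng = random.Random(seed)
    screened = random_search(rng, n_samples, n_keep, dim)
    optimized = genetic_algorithm(rng, screened, n_generations)
    return screened, optimized


if __name__ == "__main__":
    screened, optimized = run_pipeline()
    print("best score after random search:", toy_score(screened[0]))
    print("best score after GA:", toy_score(optimized[0]))
```

Because selection in this sketch is elitist, the best score after the GA stage is never worse than the best score after random search. A real CSP pipeline would replace the vector encoding with lattice parameters, space-group choices, and molecular positions and orientations, and would relax each candidate before scoring it.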
Submission history
From: Kevin Ryczko
[v1] Thu, 3 Aug 2023 19:12:00 UTC (7,407 KB)
[v2] Mon, 11 Dec 2023 15:57:45 UTC (8,176 KB)