Physics > Chemical Physics
[Submitted on 20 Jun 2024]
Title: CheMFi: A Multifidelity Dataset of Quantum Chemical Properties of Diverse Molecules
Abstract: Progress in both Machine Learning (ML) and conventional Quantum Chemistry (QC) computational methods has resulted in high-accuracy ML models for QC properties ranging from atomization energies to excitation energies. Various datasets such as MD17, MD22, and WS22, which consist of properties calculated at some level of QC method, or fidelity, have been generated to benchmark such ML models. The term fidelity refers to how closely the chosen QC method approximates the true value of the property. The higher the fidelity, the more accurate the calculated property, albeit at a higher computational cost.
Research in multifidelity ML (MFML) methods, where ML models are trained on data from more than one numerical QC method, has shown the effectiveness of such models over single-fidelity methods. Research in this direction is progressing for diverse applications ranging from energy band gaps to excitation energies. A major hurdle for effective research in this field is the lack of a diverse multifidelity dataset for benchmarking.
Here, we present a comprehensive multifidelity dataset drawn from the WS22 molecular conformations. We provide the quantum Chemistry MultiFidelity (CheMFi) dataset consisting of five fidelities calculated with the TD-DFT formalism. The fidelities differ in their choice of basis set: STO-3G, 3-21G, 6-31G, def2-SVP, and def2-TZVP. CheMFi offers the community a variety of QC properties including vertical excitation energies, oscillator strengths, molecular dipole moments, and ground state energies. In addition to the dataset, multifidelity benchmarks are set with state-of-the-art MFML and optimized-MFML