AI Discovering a Coordinate System of Chemical Elements: Dual Representation by Variational Autoencoders

Glushkovsky, Alex

arXiv:2011.12090 (cs)

[Submitted on 24 Nov 2020 (v1), last revised 22 Jan 2025 (this version, v5)]

Title: AI Discovering a Coordinate System of Chemical Elements: Dual Representation by Variational Autoencoders

Authors: Alex Glushkovsky

Abstract: The periodic table is a fundamental representation of chemical elements that plays essential theoretical and practical roles. This article reports on unsupervised training of neural networks to represent elements in a 2D latent space based on their electron configurations. To emphasize the chemical properties of the elements, the original electron configuration data has been realigned towards valence orbitals. Recognizing seven shells and four subshells, the input data has been arranged as 7x4 images. The latent space representation has been obtained using a convolutional beta variational autoencoder (beta-VAE). Despite discrete and sparse input data, the beta-VAE disentangles elements of different periods, blocks, groups, and types. The unsupervised representation of elements in the latent space reveals pairwise symmetries of periods and elements related to the invariance of quantum numbers of the corresponding elements. In addition, it isolates outliers that turn out to be known violations of Madelung's rule among lanthanide and actinide elements. Given the generative capabilities of the beta-VAE, supervised machine learning has been applied to determine whether insightful patterns distinguish the electron configurations of real elements from those of decoded artificial ones. The article also addresses the capability of dual representation by autoencoders. Conventionally, autoencoders represent observations of the input data in the latent space. By transposing and duplicating the original input data, it is possible to represent variables in the latent space, which can lead to the discovery of meaningful patterns among input variables. When this unsupervised learning is applied to transposed electron configuration data, the order in which the encoder arranges the input variables in the latent space turns out to exactly match the sequence of Madelung's rule.
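To make the 7x4 input layout and the Madelung sequence mentioned in the abstract concrete, the following is a minimal Python sketch, not the author's code. It fills electrons in idealized Madelung order (increasing n + l, ties broken by smaller n) into a grid with seven shell rows and four subshell columns (s, p, d, f). The function names `madelung_order` and `config_grid` are hypothetical, known exceptions to Madelung's rule are not modeled, and the realignment towards valence orbitals described in the abstract is not reproduced here.

```python
# Illustrative sketch only: idealized Madelung filling, no exceptions,
# and no valence realignment as described in the abstract.

SUBSHELLS = "spdf"  # column index l = 0..3


def madelung_order(max_n=7):
    """Return (n, l) pairs sorted by n + l, ties broken by smaller n."""
    pairs = [(n, l) for n in range(1, max_n + 1) for l in range(min(n, 4))]
    return sorted(pairs, key=lambda p: (p[0] + p[1], p[0]))


def config_grid(z, max_n=7):
    """Place z electrons in Madelung order on a max_n x 4 occupancy grid.

    Rows are shells n = 1..max_n, columns are subshells s, p, d, f.
    """
    grid = [[0] * 4 for _ in range(max_n)]
    remaining = z
    for n, l in madelung_order(max_n):
        if remaining == 0:
            break
        capacity = 2 * (2 * l + 1)  # 2, 6, 10, 14 electrons for s, p, d, f
        placed = min(capacity, remaining)
        grid[n - 1][l] = placed
        remaining -= placed
    return grid


if __name__ == "__main__":
    labels = [f"{n}{SUBSHELLS[l]}" for n, l in madelung_order()]
    print(labels[:7])  # ['1s', '2s', '2p', '3s', '3p', '4s', '3d']
    for row in config_grid(26):  # iron, Z = 26
        print(row)
```

For iron (Z = 26) the first four rows come out as [2, 0, 0, 0], [2, 6, 0, 0], [2, 6, 6, 0], and [2, 0, 0, 0], matching 1s2 2s2 2p6 3s2 3p6 3d6 4s2. Elements whose real ground-state configuration deviates from this filling order would need a lookup of measured configurations instead.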

Comments:	18 pages, 15 figures, 5 tables
Subjects:	Machine Learning (cs.LG); Chemical Physics (physics.chem-ph)
MSC classes:	68T30
Cite as:	arXiv:2011.12090 [cs.LG]
	(or arXiv:2011.12090v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2011.12090

Submission history

From: Alex Glushkovsky
[v1] Tue, 24 Nov 2020 13:54:23 UTC (1,120 KB)
[v2] Thu, 26 Nov 2020 00:24:44 UTC (1,125 KB)
[v3] Sun, 13 Dec 2020 19:45:56 UTC (1,303 KB)
[v4] Tue, 14 Dec 2021 18:03:43 UTC (1,302 KB)
[v5] Wed, 22 Jan 2025 23:10:20 UTC (1,233 KB)
