AlloyBERT: Alloy Property Prediction with Large Language Models

Chaudhari, Akshat; Guntuboina, Chakradhar; Huang, Hongshuo; Farimani, Amir Barati

Condensed Matter > Materials Science

arXiv:2403.19783 (cond-mat)

[Submitted on 28 Mar 2024]

Abstract: The pursuit of novel alloys tailored to specific requirements poses significant challenges for researchers in the field. This underscores the importance of developing predictive techniques for essential physical properties of alloys based on their chemical composition and processing parameters. This study introduces AlloyBERT, a transformer encoder-based model designed to predict properties such as elastic modulus and yield strength of alloys using textual inputs. Leveraging the pre-trained RoBERTa encoder model as its foundation, AlloyBERT employs self-attention mechanisms to establish meaningful relationships between words, enabling it to interpret human-readable input and predict target alloy properties. By combining a tokenizer trained on our textual data and a RoBERTa encoder pre-trained and fine-tuned for this specific task, we achieved a mean squared error (MSE) of 0.00015 on the Multi Principal Elemental Alloys (MPEA) dataset and 0.00611 on the Refractory Alloy Yield Strength (RAYS) dataset. This surpasses the performance of shallow models, which achieved best-case MSEs of 0.00025 and 0.0076 on the MPEA and RAYS datasets, respectively. Our results highlight the potential of language models in materials science and establish a foundational framework for text-based prediction of alloy properties that does not rely on complex underlying representations, calculations, or simulations.
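To make the text-based framing concrete, the following is a minimal sketch, not the authors' pipeline. It shows how a composition and a processing route might be rendered as a human-readable sentence of the kind a language model could consume, and how the MSE metric quoted in the abstract is computed. The sentence template, the function names `describe_alloy` and `mean_squared_error`, and the example alloy are all hypothetical and are not taken from the paper. The tokenizer and RoBERTa encoder are omitted entirely.

```python
from typing import Dict, Sequence


def describe_alloy(composition: Dict[str, float], processing: str) -> str:
    """Render an alloy as one sentence (hypothetical template, not the paper's)."""
    # Sort elements alphabetically so the same alloy always yields the same text.
    parts = [f"{frac:.2f} {elem}" for elem, frac in sorted(composition.items())]
    return f"Alloy with atomic fractions {', '.join(parts)}; processed by {processing}."


def mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Average of squared differences between predictions and targets."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Illustrative equiatomic refractory alloy; values chosen for the example only.
text = describe_alloy({"Nb": 0.25, "Mo": 0.25, "Ta": 0.25, "W": 0.25}, "arc melting")
print(text)

# Toy predictions and targets, not results from the paper.
print(mean_squared_error([0.5, 0.25], [0.5, 0.75]))
```

In a real setup, a string such as `text` would be tokenized and passed to the encoder with a regression head, and the MSE would be evaluated over a held-out set.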

Comments: 20 pages, 3 figures
Subjects: Materials Science (cond-mat.mtrl-sci); Machine Learning (cs.LG)
Cite as: arXiv:2403.19783 [cond-mat.mtrl-sci] (or arXiv:2403.19783v1 [cond-mat.mtrl-sci] for this version)
DOI: https://doi.org/10.48550/arXiv.2403.19783

Submission history

From: Chakradhar Guntuboina
[v1] Thu, 28 Mar 2024 19:09:46 UTC (802 KB)
