SynAsk: Unleashing the Power of Large Language Models in Organic Synthesis

Zhang, Chonghuan; Lin, Qianghua; Zhu, Biwei; Yang, Haopeng; Lian, Xiao; Deng, Hao; Zheng, Jiajun; Liao, Kuangbiao

arXiv:2406.04593 (physics)

[Submitted on 7 Jun 2024 (v1), last revised 14 Jun 2024 (this version, v2)]

Abstract: The field of natural language processing (NLP) has witnessed a transformative shift with the emergence of large language models (LLMs), which have revolutionized various language tasks and applications. Integrating LLMs into specialized domains enhances their capabilities for domain-specific applications. Notably, NLP has made significant strides in organic chemistry, particularly in predicting synthetic tasks, paving the way for the development of LLMs tailored to the organic chemistry field. In this work, we introduce SynAsk, a comprehensive organic chemistry domain-specific LLM platform developed by AIChemEco Inc. By fine-tuning an LLM with domain-specific data and integrating it with a chain-of-thought approach, SynAsk seamlessly accesses our knowledge base and advanced chemistry tools in a question-and-answer format. Its functionalities include a basic chemistry knowledge base, molecular information retrieval, reaction performance prediction, retrosynthesis prediction, chemical literature acquisition, and more. This novel methodology combines fine-tuning techniques with external resource integration, resulting in an organic chemistry-specific model poised to facilitate research and discovery in the field. Accessible via this http URL, SynAsk represents a significant advancement in leveraging NLP for synthetic applications.

Subjects:	Chemical Physics (physics.chem-ph); Biomolecules (q-bio.BM)
Cite as:	arXiv:2406.04593 [physics.chem-ph]
	(or arXiv:2406.04593v2 [physics.chem-ph] for this version)
	https://doi.org/10.48550/arXiv.2406.04593

Submission history

From: Chonghuan Zhang
[v1] Fri, 7 Jun 2024 02:58:12 UTC (31,196 KB)
[v2] Fri, 14 Jun 2024 03:36:32 UTC (15,337 KB)
