SynAsk: unleashing the power of large language models in organic synthesis.

Authors

Zhang Chonghuan, Lin Qianghua, Zhu Biwei, Yang Haopeng, Lian Xiao, Deng Hao, Zheng Jiajun, Liao Kuangbiao

Affiliations

Guangzhou National Laboratory, Guangzhou, Guangdong 510005, PR China.

AIChemEco Inc., Guangzhou, Guangdong 510005, PR China.

Publication information

Chem Sci. 2024 Nov 18;16(1):43-56. doi: 10.1039/d4sc04757e. eCollection 2024 Dec 18.

Abstract

The emergence of large language models (LLMs) has transformed the field of natural language processing (NLP) across a wide range of language tasks and applications, and integrating LLMs into specialized domains extends their capabilities to domain-specific applications. Notably, NLP has made significant strides in organic chemistry, particularly in predicting synthetic tasks, paving the way for LLMs tailored to the organic chemistry field. In this work, we introduce SynAsk, a comprehensive organic chemistry domain-specific LLM platform developed by AIChemEco Inc. By fine-tuning an LLM with domain-specific data and integrating it with a chain-of-thought approach, SynAsk accesses our knowledge base and advanced chemistry tools in a question-and-answer format. Its functionalities include a basic chemistry knowledge base, molecular information retrieval, reaction performance prediction, retrosynthesis prediction, and chemical literature acquisition, among others. This methodology combines fine-tuning techniques with external resource integration, resulting in an organic chemistry-specific model poised to facilitate research and discovery in the field. SynAsk is accessible at https://synask.aichemeco.com and represents a significant advancement in leveraging NLP for synthetic applications.
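
The question-and-answer flow described in the abstract, where a model decides which chemistry tool a question needs and then calls it, can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions: the abstract does not describe SynAsk's implementation, so every function name below is hypothetical, the tools are stubs, and simple keyword matching stands in for the fine-tuned LLM's chain-of-thought tool selection.

```python
from typing import Callable, Dict

# Hypothetical tool stubs. The real SynAsk tools are not specified in the
# abstract; these only echo the query so the routing can be observed.
def molecule_info(query: str) -> str:
    return f"[molecule info stub] {query}"

def retrosynthesis(query: str) -> str:
    return f"[retrosynthesis stub] {query}"

def literature_search(query: str) -> str:
    return f"[literature stub] {query}"

def knowledge_base(query: str) -> str:
    return f"[knowledge base stub] {query}"

# Assumption: keyword matching replaces the LLM's reasoning step that
# would normally pick a tool. Keys are checked in insertion order.
ROUTES: Dict[str, Callable[[str], str]] = {
    "retrosynthesis": retrosynthesis,
    "literature": literature_search,
    "molecule": molecule_info,
}

def choose_tool(question: str) -> Callable[[str], str]:
    """Return the first tool whose keyword appears in the question."""
    lowered = question.lower()
    for keyword, tool in ROUTES.items():
        if keyword in lowered:
            return tool
    # Fall back to the general knowledge base when no keyword matches.
    return knowledge_base

def answer(question: str) -> str:
    """Route the question to a tool and return that tool's response."""
    tool = choose_tool(question)
    return tool(question)
```

Calling `answer("Propose a retrosynthesis for ibuprofen")` would go to the retrosynthesis stub, while a general question with no matching keyword falls back to the knowledge-base stub. In a real system the routing decision would come from the model rather than a keyword table.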

Figure (image link): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b98/11653367/f2a2310f0918/d4sc04757e-f1.jpg
