ChatGPT Chemistry Assistant for Text Mining and the Prediction of MOF Synthesis.

Authors

Zheng Zhiling, Zhang Oufan, Borgs Christian, Chayes Jennifer T, Yaghi Omar M

Affiliations

Department of Chemistry, University of California, Berkeley, California 94720, United States.

Kavli Energy Nanoscience Institute, University of California, Berkeley, California 94720, United States.

Publication information

J Am Chem Soc. 2023 Aug 16;145(32):18048-18062. doi: 10.1021/jacs.3c05819. Epub 2023 Aug 7.

DOI: 10.1021/jacs.3c05819
PMID: 37548379
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11073615/

Abstract

We use prompt engineering to guide ChatGPT in the automation of text mining of metal-organic framework (MOF) synthesis conditions from diverse formats and styles of the scientific literature. This effectively mitigates ChatGPT's tendency to hallucinate information, an issue that previously made the use of large language models (LLMs) in scientific fields challenging. Our approach involves the development of a workflow implementing three different processes for text mining, programmed by ChatGPT itself. All of them enable parsing, searching, filtering, classification, summarization, and data unification with different trade-offs among labor, speed, and accuracy. We deploy this system to extract 26 257 distinct synthesis parameters pertaining to approximately 800 MOFs sourced from peer-reviewed research articles. This process incorporates our ChemPrompt Engineering strategy to instruct ChatGPT in text mining, resulting in impressive precision, recall, and F1 scores of 90-99%. Furthermore, with the data set built by text mining, we constructed a machine-learning model with over 87% accuracy in predicting MOF experimental crystallization outcomes and preliminarily identifying important factors in MOF crystallization. We also developed a reliable data-grounded MOF chatbot to answer questions about chemical reactions and synthesis procedures. Given that the process of using ChatGPT reliably mines and tabulates diverse MOF synthesis information in a unified format while using only narrative language requiring no coding expertise, we anticipate that our ChatGPT Chemistry Assistant will be very useful across various other chemistry subdisciplines.
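
The precision, recall, and F1 figures quoted in the abstract are standard metrics for scoring extracted items against a hand-checked reference. The following is a minimal Python sketch of that calculation, not the authors' evaluation code. The function name `score_extraction`, the (MOF, parameter, value) triple layout, and the toy records for "MOF-A" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    precision: float
    recall: float
    f1: float


def score_extraction(extracted: set, reference: set) -> Scores:
    """Score extracted (MOF, parameter, value) triples against a reference set.

    A triple counts as a true positive only if it matches the reference exactly.
    """
    true_pos = len(extracted & reference)
    # Precision: fraction of extracted triples that are correct.
    precision = true_pos / len(extracted) if extracted else 0.0
    # Recall: fraction of reference triples that were recovered.
    recall = true_pos / len(reference) if reference else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, f1)


# Hypothetical hand-labelled reference for one made-up compound.
reference = {
    ("MOF-A", "metal_source", "Zn(NO3)2"),
    ("MOF-A", "temperature_C", "100"),
    ("MOF-A", "time_h", "24"),
    ("MOF-A", "solvent", "DMF"),
    ("MOF-A", "modulator", "acetic acid"),
}

# Hypothetical model output: three correct triples, one wrong value,
# and one parameter (the modulator) missed entirely.
extracted = {
    ("MOF-A", "metal_source", "Zn(NO3)2"),
    ("MOF-A", "temperature_C", "100"),
    ("MOF-A", "time_h", "48"),
    ("MOF-A", "solvent", "DMF"),
}

scores = score_extraction(extracted, reference)
print(f"precision={scores.precision:.2f} recall={scores.recall:.2f} f1={scores.f1:.2f}")
```

In this toy case three of four extracted triples are correct (precision 0.75) and three of five reference triples are recovered (recall 0.60). Exact matching is a deliberately strict assumption; a real evaluation might normalize units or chemical names before comparing.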
