Accurate, interpretable predictions of materials properties within transformer language models.

Authors

Korolev Vadim, Protsenko Pavel

Affiliation

Department of Chemistry, Lomonosov Moscow State University, 119991 Moscow, Russia.

Publication

Patterns (N Y). 2023 Aug 2;4(10):100803. doi: 10.1016/j.patter.2023.100803. eCollection 2023 Oct 13.

DOI: 10.1016/j.patter.2023.100803
PMID: 37876904
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10591138/
Abstract

Property prediction accuracy has long been a key parameter of machine learning in materials informatics. Accordingly, advanced models showing state-of-the-art performance turn into highly parameterized black boxes missing interpretability. Here, we present an elegant way to make their reasoning transparent. Human-readable text-based descriptions automatically generated within a suite of open-source tools are proposed as materials representation. Transformer language models pretrained on 2 million peer-reviewed articles take as input well-known terms such as chemical composition, crystal symmetry, and site geometry. Our approach outperforms crystal graph networks by classifying four out of five analyzed properties if one considers all available reference data. Moreover, fine-tuned text-based models show high accuracy in the ultra-small data limit. Explanations of their internal machinery are produced using local interpretability techniques and are faithful and consistent with domain expert rationales. This language-centric framework makes accurate property predictions accessible to people without artificial-intelligence expertise.
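
The workflow outlined in the abstract (render a material as text, classify the text, then explain the prediction token by token) can be illustrated with a small, self-contained sketch. This is not the authors' implementation. A naive Bayes bag-of-words model stands in for the fine-tuned transformer, and leave-one-token-out occlusion stands in for whichever local interpretability technique the paper uses. The function names, the description template, and the 0/1 labels are assumptions made for illustration. The labels follow the coordination geometry and carry no physical meaning.

```python
import math
from collections import Counter


def describe(m):
    """Render a material record as a short human-readable description (hypothetical template)."""
    return (f"{m['formula']} crystallizes in the {m['system']} {m['space_group']} space group. "
            f"{m['site']} sites have {m['geometry']} coordination geometry.")


def tokenize(text):
    """Lowercase, drop sentence periods, split on whitespace."""
    return text.lower().replace(".", " ").split()


class ToyClassifier:
    """Binary naive Bayes with add-one smoothing; a stand-in for a fine-tuned language model."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab_size = len(set(self.counts[0]) | set(self.counts[1]))
        return self

    def _log_lik(self, token, label):
        total = sum(self.counts[label].values())
        return math.log((self.counts[label][token] + 1) / (total + self.vocab_size))

    def score(self, tokens):
        """Log-odds of class 1 versus class 0 for a token list."""
        s = math.log(self.docs[1] / self.docs[0])
        for tok in tokens:
            s += self._log_lik(tok, 1) - self._log_lik(tok, 0)
        return s


def occlusion_attributions(model, text):
    """Attribution of each token = drop in score when that token is removed."""
    tokens = tokenize(text)
    full = model.score(tokens)
    return [(tok, full - model.score(tokens[:i] + tokens[i + 1:]))
            for i, tok in enumerate(tokens)]


# Toy training set: four well-known structures with arbitrary illustrative labels.
records = [
    ({"formula": "NaCl", "system": "cubic", "space_group": "Fm-3m",
      "site": "Na", "geometry": "octahedral"}, 1),
    ({"formula": "MgO", "system": "cubic", "space_group": "Fm-3m",
      "site": "Mg", "geometry": "octahedral"}, 1),
    ({"formula": "ZnS", "system": "cubic", "space_group": "F-43m",
      "site": "Zn", "geometry": "tetrahedral"}, 0),
    ({"formula": "GaAs", "system": "cubic", "space_group": "F-43m",
      "site": "Ga", "geometry": "tetrahedral"}, 0),
]
model = ToyClassifier().fit([describe(m) for m, _ in records],
                            [label for _, label in records])

# Explain the prediction for an unseen description.
query_text = describe({"formula": "LiF", "system": "cubic", "space_group": "Fm-3m",
                       "site": "Li", "geometry": "octahedral"})
ranked = sorted(occlusion_attributions(model, query_text),
                key=lambda pair: pair[1], reverse=True)
for token, attribution in ranked[:3]:
    print(f"{token:12s} {attribution:+.3f}")
```

In this toy setting the highest-ranked tokens are the space-group symbol and the coordination geometry, the same kind of domain term the abstract lists as model input. With a real transformer the occlusion loop would be unchanged and only `score` would be replaced by the model's output logit.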
