A universal foundation model for transfer learning in molecular crystals.

Authors

Feng Minggao, Zhao Chengxi, Day Graeme M, Evangelopoulos Xenophon, Cooper Andrew I

Affiliations

Materials Innovation Factory and Department of Chemistry, University of Liverpool, Liverpool, UK

School of Chemistry and Chemical Engineering, University of Southampton, Southampton, UK

Publication information

Chem Sci. 2025 May 21. doi: 10.1039/d5sc00677e.

DOI: 10.1039/d5sc00677e
PMID: 40538896
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12174991/
Abstract

The physical and chemical properties of molecular crystals are a combined function of molecular structure and the molecular crystal packing. Specific crystal packings can enable applications such as pharmaceuticals, organic electronics, and porous materials for gas storage. However, to design such materials, we need to predict both crystal structure and the resulting physical properties, and this is expensive using traditional computational methods. Machine-learned interatomic potential methods offer major accelerations here, but molecular crystal structure prediction remains challenging due to the weak intermolecular interactions that dictate crystal packing. Moreover, machine-learned interatomic potentials do not accelerate the prediction of all physical properties for molecular crystals. Here we present Molecular Crystal Representation from Transformers (MCRT), a transformer-based model for molecular crystal property prediction that is pre-trained on 706 126 experimental crystal structures extracted from the Cambridge Structural Database (CSD). MCRT employs four different pre-training tasks to extract both local and global representations from the crystals using multi-modal features to encode crystal structure and geometry. MCRT has the potential to serve as a universal foundation model for predicting a range of properties for molecular crystals, achieving state-of-the-art results even when fine-tuned on small-scale datasets. We demonstrate MCRT's practical utility in both crystal property prediction and crystal structure prediction. We also show that model predictions can be interpreted by using attention scores.
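The abstract's closing point, that predictions can be interpreted through attention scores, can be illustrated with a generic sketch. The code below is not MCRT's implementation: it computes standard scaled dot-product attention weights for one query over a few tokens and ranks the tokens by weight. The atom labels, vectors, and function names are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


def rank_tokens(labels, weights):
    """Pair each token label with its weight, highest weight first."""
    return sorted(zip(labels, weights), key=lambda pair: pair[1], reverse=True)


# Hypothetical toy inputs: a query vector standing in for a global summary
# token, and key vectors for three atom tokens. All values are made up.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
labels = ["O1", "C1", "N1"]

weights = attention_weights(query, keys)
ranking = rank_tokens(labels, weights)

for label, weight in ranking:
    print(f"{label}: {weight:.3f}")
```

In this toy setting the token whose key aligns best with the query receives the largest weight. Reading such weights as importance is a heuristic, and how MCRT extracts and visualises its attention scores is described in the paper itself.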

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b128/12264908/30af8b29d5e2/d5sc00677e-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b128/12264908/d555d5f3f389/d5sc00677e-f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b128/12264908/80a4a84e9a0e/d5sc00677e-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b128/12264908/9c6539cfadab/d5sc00677e-f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b128/12264908/45860210f1d8/d5sc00677e-f5.jpg

Cited by

1
Accurate and efficient machine learning interatomic potentials for finite temperature modelling of molecular crystals.
Chem Sci. 2025 May 23. doi: 10.1039/d5sc01325a.
