General Graph Neural Network-Based Model To Accurately Predict Cocrystal Density and Insight from Data Quality and Feature Representation.

Affiliations

College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China.

Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang 621900, China.

Publication information

J Chem Inf Model. 2023 Feb 27;63(4):1143-1156. doi: 10.1021/acs.jcim.2c01538. Epub 2023 Feb 3.

Abstract

Cocrystal engineering, an effective way to modify solid-state properties, has attracted great interest across diverse materials fields, and cocrystal density is an important property closely correlated with material function. To predict cocrystal density accurately, we develop a graph neural network (GNN)-based deep learning framework that considers three key factors of machine learning: data quality, feature representation, and model architecture. The results show that different stoichiometric ratios of molecules in cocrystals can significantly influence prediction performance, highlighting the importance of data quality. In addition, feature complementation is not suitable for augmenting the molecular graph representation in cocrystal density prediction, suggesting that a complementary strategy needs to consider whether the extra features can sufficiently supply the information missing from the original representation. Based on these results, 4144 cocrystals with a 1:1 stoichiometric ratio are selected as the dataset, supplemented by data augmentation that exchanges the pair of coformers. The molecular graph is selected as the input from which the GNN-based model learns the feature representation. Global attention is introduced to further optimize the feature space and to identify important atoms, making the model interpretable. Benefiting from these advantages, our model significantly outperforms three competitive models and exhibits high prediction accuracy for unseen cocrystals, showcasing its robustness and generality. Overall, our work not only provides a general cocrystal density prediction tool for experimental investigations but also offers useful guidelines for machine learning applications. All source code is freely available at https://github.com/Xiao-Gua00/CCPGraph.
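
The dataset preparation described in the abstract (keeping only 1:1 cocrystals, then augmenting by exchanging the coformer pair) might be sketched roughly as follows. This is a minimal illustration, not the code from the CCPGraph repository: the `CocrystalRecord` layout, the function names `select_one_to_one` and `augment_by_swap`, and the reading of "exchanging a pair of coformers" as swapping the input order of the two coformers are all assumptions made here. The SMILES strings and density values in the demo are placeholders, not data from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CocrystalRecord:
    """Hypothetical record layout (an assumption, not the CCPGraph schema)."""
    coformer_a: str          # SMILES of the first coformer
    coformer_b: str          # SMILES of the second coformer
    ratio: Tuple[int, int]   # stoichiometric ratio, e.g. (1, 1)
    density: float           # crystal density in g/cm^3


def select_one_to_one(records: List[CocrystalRecord]) -> List[CocrystalRecord]:
    """Keep only cocrystals whose stoichiometric ratio is exactly 1:1."""
    return [r for r in records if r.ratio == (1, 1)]


def augment_by_swap(records: List[CocrystalRecord]) -> List[CocrystalRecord]:
    """Return each record followed by a copy with the two coformers exchanged.

    The density label is unchanged, since both orderings describe the
    same crystal.
    """
    augmented: List[CocrystalRecord] = []
    for r in records:
        augmented.append(r)
        augmented.append(
            CocrystalRecord(r.coformer_b, r.coformer_a, r.ratio, r.density)
        )
    return augmented


if __name__ == "__main__":
    # Placeholder entries for illustration only; the densities are invented.
    toy = [
        CocrystalRecord("c1ccccc1", "OC(=O)c1ccccc1", (1, 1), 1.30),
        CocrystalRecord("c1ccccc1", "OC(=O)c1ccccc1", (2, 1), 1.35),
    ]
    kept = select_one_to_one(toy)
    print(len(kept), len(augment_by_swap(kept)))
```

Under this reading, the swap shows the model both orderings of every pair with the same target value, which should discourage predictions that depend on which coformer happens to be listed first.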

