Multi-Cover Persistence (MCP)-based machine learning for polymer property prediction.

Affiliations

Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore.

Department of Mathematics, National University of Singapore, Singapore 119076, Singapore.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae465.

Abstract

Accurate and efficient prediction of polymer properties is crucial for polymer design. Recently, data-driven artificial intelligence (AI) models have demonstrated great promise in polymer property analysis. Despite this progress, a pivotal challenge for all AI-driven models remains the effective representation of molecules. Here we introduce, for the first time, Multi-Cover Persistence (MCP)-based molecular representation and featurization. Our MCP-based polymer descriptors are combined with machine learning models, in particular Gradient Boosting Tree (GBT) models, for polymer property prediction. Unlike all previous molecular representations, polymer molecular structure and interactions are represented as MCP, which uses Delaunay slices at different dimensions and rhomboid tiling to characterize the complicated geometric and topological information within the data. Statistical features of the generated persistent barcodes are used as polymer descriptors and are then combined with the GBT model. Our model has been extensively validated on polymer benchmark datasets. We find that our models outperform traditional fingerprint-based models and achieve accuracy similar to that of geometric deep learning models. In particular, our model tends to be more effective on large monomer structures, demonstrating the great potential of MCP for characterizing more complicated polymer data. This work underscores the potential of MCP in polymer informatics and presents a novel perspective on molecular representation and its application in polymer science.
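
The featurization step described in the abstract, turning persistent barcodes into statistical descriptors, can be illustrated with a minimal sketch. The abstract does not say which statistics are used, so the six chosen here (count, minimum, maximum, mean, standard deviation, and sum of bar lengths), the function name barcode_features, and the example barcodes are all assumptions made for illustration. Computing the MCP barcodes themselves is not shown.

```python
from statistics import mean, pstdev


def barcode_features(bars):
    """Summarize one persistence barcode as a fixed-length feature vector.

    `bars` is a list of (birth, death) pairs with finite death values.
    The choice of statistics is an assumption, not taken from the paper.
    """
    if not bars:
        # An empty barcode maps to zeros so every sample has the same length.
        return [0.0] * 6
    lengths = [death - birth for birth, death in bars]
    return [
        float(len(bars)),  # number of bars
        min(lengths),      # shortest bar
        max(lengths),      # longest bar
        mean(lengths),     # average bar length
        pstdev(lengths),   # population standard deviation of bar lengths
        sum(lengths),      # total persistence
    ]


# Hypothetical barcodes for one monomer, keyed by homology dimension.
barcodes = {0: [(0.0, 1.0), (0.0, 1.5), (0.0, 2.0)], 1: [(1.5, 2.5)]}

# Concatenate the per-dimension statistics into a single descriptor.
descriptor = barcode_features(barcodes[0]) + barcode_features(barcodes[1])
```

A descriptor of this kind, one row per polymer, could then be passed to a gradient boosting regressor such as scikit-learn's GradientBoostingRegressor; the abstract does not state which implementation the authors used.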

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9b42/11424509/9965a5a5a8b0/bbae465f1.jpg
