DG-GL: Differential geometry-based geometric learning of molecular datasets.

Affiliations

Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA.

Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA.

Publication information

Int J Numer Method Biomed Eng. 2019 Mar;35(3):e3179. doi: 10.1002/cnm.3179. Epub 2019 Feb 7.

Abstract

MOTIVATION

Despite its great success in physical modeling, differential geometry (DG) has rarely been used as a versatile tool for analyzing large, diverse, and complex molecular and biomolecular datasets, because its potential for dimensionality reduction and its ability to encode essential chemical and biological information in differentiable manifolds are poorly understood.

RESULTS

We put forward a differential geometry-based geometric learning (DG-GL) hypothesis that the intrinsic physics of three-dimensional (3D) molecular structures lies on a family of low-dimensional manifolds embedded in a high-dimensional data space. We encode crucial chemical, physical, and biological information into 2D element interactive manifolds, extracted from a high-dimensional structural data space via a multiscale discrete-to-continuum mapping using differentiable density estimators. Differential geometry tools are used to construct element interactive curvatures in analytical form for certain analytically differentiable density estimators. These low-dimensional differential geometry representations are paired with a robust machine learning algorithm to showcase their descriptive and predictive power for large, diverse, and complex molecular and biomolecular datasets. Extensive numerical experiments demonstrate that the proposed DG-GL strategy outperforms other advanced methods in predicting drug discovery-related protein-ligand binding affinity, drug toxicity, and molecular solvation free energy.
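To make the discrete-to-continuum idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian kernel with a single scale parameter `eta` as the differentiable density estimator (the abstract does not say which kernels are used), and it evaluates the Gaussian and mean curvature of the density's level surface with the standard implicit-surface formulas. The function names `density_terms` and `curvatures` are hypothetical.

```python
import numpy as np

def density_terms(point, centers, eta):
    """Density, gradient and Hessian at `point` for atoms located at `centers`.

    Assumed kernel (not taken from the abstract): exp(-|r - r_j|^2 / eta^2).
    """
    d = point - centers                                  # (n, 3) displacements
    w = np.exp(-np.sum(d * d, axis=1) / eta**2)          # (n,) kernel values
    rho = w.sum()
    grad = (-2.0 / eta**2) * (w[:, None] * d).sum(axis=0)
    outer = np.einsum("n,ni,nj->ij", w, d, d)            # sum_n w_n d_n d_n^T
    hess = (4.0 / eta**4) * outer - (2.0 / eta**2) * rho * np.eye(3)
    return rho, grad, hess

def curvatures(point, centers, eta):
    """Gaussian and mean curvature of the density level surface through `point`.

    Undefined where the gradient vanishes (for example, exactly at the
    center of an isolated atom).
    """
    _, g, h = density_terms(point, centers, eta)
    g2 = g @ g
    # Cofactor matrix of the Hessian; equals the adjugate because h is symmetric.
    cof = np.array([np.cross(h[1], h[2]),
                    np.cross(h[2], h[0]),
                    np.cross(h[0], h[1])])
    gaussian = (g @ cof @ g) / g2**2
    mean = (g @ h @ g - g2 * np.trace(h)) / (2.0 * g2**1.5)
    return gaussian, mean

# Sanity check: for one atom the level surface is a sphere of radius 2,
# so the Gaussian curvature is 1/4 and the mean curvature is 1/2.
centers = np.array([[0.0, 0.0, 0.0]])
K, H = curvatures(np.array([2.0, 0.0, 0.0]), centers, eta=1.5)
```

In an element-interactive setting, `centers` would hold only the atoms of one element type and `point` would range over atoms of another type, with curvature values at several choices of `eta` collected into a feature vector for a machine learning model. That pairing scheme is an illustrative reading of the abstract, not a detail it specifies.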

AVAILABILITY AND IMPLEMENTATION

http://weilab.math.msu.edu/DG-GL/ Contact: wei@math.msu.edu.
