Direct generation of protein conformational ensembles via machine learning.

Affiliation

Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA.

Publication information

Nat Commun. 2023 Feb 11;14(1):774. doi: 10.1038/s41467-023-36443-x.

DOI:10.1038/s41467-023-36443-x
PMID:36774359
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9922302/
Abstract

Dynamics and conformational sampling are essential for linking protein structure to biological function. While challenging to probe experimentally, computer simulations are widely used to describe protein dynamics, but at significant computational costs that continue to limit the systems that can be studied. Here, we demonstrate that machine learning can be trained with simulation data to directly generate physically realistic conformational ensembles of proteins without the need for any sampling and at negligible computational cost. As a proof-of-principle we train a generative adversarial network based on a transformer architecture with self-attention on coarse-grained simulations of intrinsically disordered peptides. The resulting model, idpGAN, can predict sequence-dependent coarse-grained ensembles for sequences that are not present in the training set demonstrating that transferability can be achieved beyond the limited training data. We also retrain idpGAN on atomistic simulation data to show that the approach can be extended in principle to higher-resolution conformational ensemble generation.
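The idea in the abstract, a generator that maps a sequence plus random latent input directly to a conformation so that an ensemble is obtained by drawing many latent inputs, can be illustrated with a toy sketch. This is not the idpGAN implementation: the class `ToyGenerator`, the layer sizes, the single attention head, and the random untrained weights are all assumptions made for illustration. Positional encoding, the discriminator, and adversarial training are omitted.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rng, rows, cols, scale=0.3):
    """Gaussian-initialised weight matrix (untrained, illustration only)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over residues."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scale = math.sqrt(len(q[0]))
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)


class ToyGenerator:
    """Hypothetical generator: (sequence, latent noise) -> one 3D point per residue."""

    def __init__(self, latent_dim=8, model_dim=16, seed=0):
        rng = random.Random(seed)
        self.latent_dim = latent_dim
        self.w_in = random_matrix(rng, len(AMINO_ACIDS) + latent_dim, model_dim)
        self.wq = random_matrix(rng, model_dim, model_dim)
        self.wk = random_matrix(rng, model_dim, model_dim)
        self.wv = random_matrix(rng, model_dim, model_dim)
        self.w_out = random_matrix(rng, model_dim, 3)

    def generate(self, sequence, rng):
        # Each residue is a one-hot amino-acid code concatenated with fresh noise.
        x = []
        for aa in sequence:
            one_hot = [1.0 if aa == a else 0.0 for a in AMINO_ACIDS]
            noise = [rng.gauss(0.0, 1.0) for _ in range(self.latent_dim)]
            x.append(one_hot + noise)
        h = matmul(x, self.w_in)
        attended = self_attention(h, self.wq, self.wk, self.wv)
        # Residual connection, then project every residue to x, y, z.
        h = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(h, attended)]
        return matmul(h, self.w_out)


def generate_ensemble(generator, sequence, n_conformations, seed=0):
    """Draw independent latent inputs; each forward pass yields one conformation."""
    rng = random.Random(seed)
    return [generator.generate(sequence, rng) for _ in range(n_conformations)]


if __name__ == "__main__":
    ensemble = generate_ensemble(ToyGenerator(), "GSGSAKE", n_conformations=5)
    print(len(ensemble), "conformations of", len(ensemble[0]), "residues")
```

With untrained weights the output coordinates are meaningless. The sketch only shows why generation is cheap once a model is trained: each conformation costs one forward pass rather than a simulation trajectory.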


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/9cc5aa20b897/41467_2023_36443_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/16f83722799b/41467_2023_36443_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/8241d94ff8ed/41467_2023_36443_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/547cfded4ee4/41467_2023_36443_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/1f1864fcbd3e/41467_2023_36443_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/a3a1bc86358a/41467_2023_36443_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6447/9922302/b7ed3d082247/41467_2023_36443_Fig7_HTML.jpg
