Physics-Based Machine Learning Trains Hamiltonians and Decodes the Sequence-Conformation Relation in the Disordered Proteome.

Affiliations

Department of Physics and Astronomy, University of Denver, Denver, Colorado 80210, United States.

Department of Molecular and Cellular Biophysics, University of Denver, Denver, Colorado 80210, United States.

Publication information

J Chem Theory Comput. 2024 Nov 26;20(22):10266-10274. doi: 10.1021/acs.jctc.4c01114. Epub 2024 Nov 6.

Abstract

Intrinsically disordered proteins and regions (IDPs) are involved in vital biological processes. To understand IDP function, which is often controlled by conformation, we need to find the link between sequence and conformation. We decode this link by integrating theory, simulation, and machine learning (ML): sequence-dependent electrostatics is modeled analytically, while the nonelectrostatic interaction is extracted from simulations of many sequences and subsequently learned using ML. The resulting Hamiltonian, combining physics-based electrostatics and machine-learned nonelectrostatics, accurately predicts sequence-specific global and local measures of conformation beyond the original observable used from the simulation. This is in contrast to traditional ML approaches that train on and predict a specific observable, not a Hamiltonian. Our formalism reproduces experimental measurements; predicts multiple conformational features directly from sequence with high throughput, which will give insights into IDP design and evolution; and illustrates the broad utility of using physics-based ML to train unknown parts of a Hamiltonian, rather than a specific observable, in combination with known physics.
