

Riemannian geometry for efficient analysis of protein dynamics data.

Affiliations

Faculty of Mathematics, University of Cambridge, CB3 0WA Cambridge, United Kingdom.

Institute of Mathematics and Image Computing, University of Lübeck, 23562 Lübeck, Germany.

Publication information

Proc Natl Acad Sci U S A. 2024 Aug 13;121(33):e2318951121. doi: 10.1073/pnas.2318951121. Epub 2024 Aug 9.

Abstract

An increasingly common viewpoint is that protein dynamics datasets reside in a nonlinear subspace of low conformational energy. Ideal data analysis tools should therefore account for such nonlinear geometry. The Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of geometries that can be modeled after an energy landscape. Second, many standard data analysis tools developed for data in Euclidean space can be generalized to Riemannian manifolds. In the context of protein dynamics, a conceptual challenge comes from the lack of guidelines for constructing a smooth Riemannian structure based on an energy landscape. In addition, computational feasibility in computing geodesics and related mappings poses a major challenge. This work considers these challenges. The first part of the paper develops a local approximation technique for computing geodesics and related mappings on Riemannian manifolds in a computationally feasible manner. The second part constructs a smooth manifold and a Riemannian structure that is based on an energy landscape for protein conformations. The resulting Riemannian geometry is tested on several data analysis tasks relevant for protein dynamics data. In particular, the geodesics with given start- and end-points approximately recover corresponding molecular dynamics trajectories for proteins that undergo relatively ordered transitions with medium-sized deformations. The Riemannian protein geometry also gives physically realistic summary statistics and retrieves the underlying dimension even for large-sized deformations within seconds on a laptop.
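The claim that standard Euclidean tools generalize to Riemannian manifolds can be illustrated with a toy example. The sketch below is not the paper's protein geometry or its approximation technique: it assumes the unit sphere as a stand-in manifold, because its exponential and logarithmic maps have closed forms, and computes a Fréchet mean (the manifold analogue of the arithmetic mean) by fixed-point iteration. The names `sphere_log`, `sphere_exp`, and `frechet_mean` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p that points toward q,
    with length equal to the geodesic (great-circle) distance."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    u = q - cos_theta * p              # component of q orthogonal to p
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:
        # q coincides with p (or is antipodal, where the log is not unique).
        return np.zeros_like(p)
    return np.arccos(cos_theta) * u / norm_u

def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p in direction v."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return p
    return np.cos(n) * p + np.sin(n) * v / n

def frechet_mean(points, iters=100, tol=1e-10):
    """Fréchet mean by repeatedly averaging in the tangent space
    and mapping back onto the manifold."""
    mean = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        step = np.mean([sphere_log(mean, q) for q in points], axis=0)
        if np.linalg.norm(step) < tol:
            break
        mean = sphere_exp(mean, step)
    return mean

# Assumed toy data: three points placed symmetrically around the north pole.
colat = 0.3
azimuths = np.deg2rad([0.0, 120.0, 240.0])
points = np.array([
    [np.sin(colat) * np.cos(a), np.sin(colat) * np.sin(a), np.cos(colat)]
    for a in azimuths
])

mean = frechet_mean(points)
euclidean_mean = points.mean(axis=0)

print("Frechet mean:", mean)
print("Norm of Euclidean mean:", np.linalg.norm(euclidean_mean))
```

In this toy setting the Euclidean average falls inside the sphere, so it is not a valid point of the manifold, while the Fréchet mean stays on it. This mirrors, in a much simpler geometry, why summary statistics computed in a suitable Riemannian geometry can remain physically plausible.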


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbee/11331106/fe81da0013a5/pnas.2318951121fig01.jpg
