Deep clustering of protein folding simulations.

Affiliation

Computational Science and Engineering Division, Oak Ridge National Laboratory, One Bethel Valley Road, MS6085, Oak Ridge, TN, USA.

Publication information

BMC Bioinformatics. 2018 Dec 21;19(Suppl 18):484. doi: 10.1186/s12859-018-2507-5.

Abstract

BACKGROUND

We examine the problem of clustering biomolecular simulations using deep learning techniques. Because biomolecular simulation datasets are inherently high-dimensional, it is often necessary to build low-dimensional representations from which quantitative insights into the atomistic mechanisms underlying complex biological processes can be extracted.

RESULTS

We use a convolutional variational autoencoder (CVAE) to learn low-dimensional, biophysically relevant latent features from long time-scale protein folding simulations in an unsupervised manner. We demonstrate our approach on three model protein folding systems, namely Fs-peptide (14 μs aggregate sampling), villin headpiece (single trajectory of 125 μs) and β-β-α (BBA) protein (223 + 102 μs sampling across two independent trajectories). In these systems, we show that the latent features learned by the CVAE correspond to distinct conformational substates along the protein folding pathways. On average, the CVAE model correctly predicts nearly 89% of all contacts within the folding trajectories, while extracting folded, unfolded and potentially misfolded states in an unsupervised manner. Further, latent features of protein folding learned by the CVAE model on one trajectory can be applied to other independent trajectories, which makes the model particularly attractive for identifying intrinsic features of conformational substates that share similar structures.
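To make the "contacts predicted correctly" figure more concrete, the following is a minimal sketch of how a per-frame contact map and a reconstruction score could be computed. It is not the authors' code: the function names, the 8 Å distance cutoff, the use of one representative atom per residue, the 0.5 decision threshold, and the entry-wise accuracy definition are all assumptions made for illustration, and the "decoded" matrix is a hand-made stand-in for a decoder output.

```python
import numpy as np


def contact_map(coords, cutoff=8.0):
    """Binary residue-residue contact matrix for a single frame.

    coords: (n_residues, 3) array, one representative atom per residue,
    in angstroms. The 8 A cutoff is an assumed default, not a value
    taken from the abstract.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist < cutoff).astype(np.uint8)


def contact_accuracy(true_map, decoded_probs, threshold=0.5):
    """Fraction of matrix entries whose contact label is recovered.

    decoded_probs: per-entry contact probabilities, e.g. from a decoder.
    This entry-wise definition is an assumption; the abstract does not
    say how the reported percentage was computed.
    """
    predicted = (decoded_probs >= threshold).astype(np.uint8)
    return float((predicted == true_map).mean())


# Toy frame: four residues on a straight line, 5 A apart.
coords = np.array([
    [0.0, 0.0, 0.0],
    [5.0, 0.0, 0.0],
    [10.0, 0.0, 0.0],
    [15.0, 0.0, 0.0],
])
cmap = contact_map(coords)

# Hypothetical decoder output: the true map with one spurious contact
# between residues 0 and 2 (kept symmetric).
decoded = cmap.astype(float)
decoded[0, 2] = decoded[2, 0] = 0.9

acc = contact_accuracy(cmap, decoded)
print(cmap)
print(f"contact accuracy: {acc:.3f}")
```

In a real workflow, each simulation frame would yield one such matrix, and the stack of matrices would serve as image-like input to a convolutional encoder; the score would then be averaged over frames.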

CONCLUSIONS

Together, we show that the CVAE model can quantitatively describe complex biophysical processes such as protein folding.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a9ad/6302667/0b39e0e6e6d5/12859_2018_2507_Fig1_HTML.jpg
