UMAP as a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study.

Affiliations

Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States.

Department of Statistical Science, Southern Methodist University, Dallas, Texas 75275, United States.

Publication information

J Phys Chem B. 2021 May 20;125(19):5022-5034. doi: 10.1021/acs.jpcb.1c02081. Epub 2021 May 11.

DOI: 10.1021/acs.jpcb.1c02081
PMID: 33973773
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8356557/
Abstract

Proteins are the molecular machines of life. The multitude of possible conformations that proteins can adopt determines their free-energy landscapes. However, the inherently high dimensionality of a protein free-energy landscape poses a challenge to deciphering how proteins perform their functions. For this reason, dimensionality reduction is an active field of research for molecular biologists. The uniform manifold approximation and projection (UMAP) is a dimensionality reduction method based on a fuzzy topological analysis of data. In the present study, the performance of UMAP is compared with that of other popular dimensionality reduction methods such as t-distributed stochastic neighbor embedding (t-SNE), principal component analysis (PCA), and time-structure independent components analysis (tICA) in the context of analyzing molecular dynamics simulations of the circadian clock protein VIVID. A good dimensionality reduction method should accurately represent the data structure on the projected components. The comparison of the raw high-dimensional data with the projections obtained using different dimensionality reduction methods based on various metrics showed that UMAP has superior performance when compared with linear reduction methods (PCA and tICA) and has competitive performance and scalable computational cost.
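
The idea of scoring a projection against the raw high-dimensional data can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's protocol: the abstract does not name its metrics, so the Pearson correlation of pairwise distances used here is an assumed example metric. The synthetic three-state "trajectory" is a hypothetical stand-in for the VIVID simulations, and the random projection is an added baseline that does not appear in the study. Only PCA is implemented, with NumPy's SVD. UMAP, t-SNE, and tICA are not run here.

```python
import numpy as np


def pairwise_distances(x):
    """Return the Euclidean distances between all distinct pairs of rows."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(x), k=1)
    return dist[upper]


def distance_correlation_score(high, low):
    """Pearson correlation between pairwise distances before and after projection.

    Assumed example metric: values near 1 mean the projection keeps the
    distance structure of the raw data.
    """
    return float(np.corrcoef(pairwise_distances(high), pairwise_distances(low))[0, 1])


def pca_project(x, n_components=2):
    """Project onto the leading principal components via SVD of the centered data."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def random_project(x, n_components=2, seed=0):
    """Baseline (not part of the study): project onto random Gaussian directions."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(x.shape[1], n_components))
    return (x - x.mean(axis=0)) @ weights


def make_toy_trajectory(n_per_state=100, n_features=30, seed=0):
    """Hypothetical data: three Gaussian 'conformational states' in feature space."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=5.0, size=(3, n_features))
    states = [c + rng.normal(size=(n_per_state, n_features)) for c in centers]
    return np.vstack(states)


frames = make_toy_trajectory()
scores = {
    "pca": distance_correlation_score(frames, pca_project(frames)),
    "random": distance_correlation_score(frames, random_project(frames)),
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Any other method that returns an array of shape (n_frames, n_components) could be scored with the same `distance_correlation_score` call, which is what makes a side-by-side comparison of projections possible.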
