Improving the analysis of biological ensembles through extended similarity measures.

Affiliations

Department of Chemistry, University of Florida, Gainesville, FL, 32611, USA.

Quantum Theory Project, University of Florida, Gainesville, FL, 32611, USA.

Publication information

Phys Chem Chem Phys. 2021 Dec 22;24(1):444-451. doi: 10.1039/d1cp04019g.

DOI: 10.1039/d1cp04019g
PMID: 34897334

Abstract

We present new algorithms to classify structural ensembles of macromolecules based on the recently proposed extended similarity measures. Molecular dynamics provides a wealth of structural information on systems of biological interest. As computer power increases, we capture larger ensembles and larger conformational transitions between states. Typically, structural clustering provides the statistical mechanics treatment of the system to identify relevant biological states. The key advantage of our approach is that the newly introduced extended similarity indices reduce the computational complexity of assessing the similarity of a set of structures from O(N²) to O(N). Here we take advantage of this favorable cost to develop several highly efficient techniques, including a linear-scaling algorithm to determine the medoid of a set (which we effectively use to select the most representative structure of a cluster). Moreover, we use our extended similarity indices as a linkage criterion in a novel hierarchical agglomerative clustering algorithm. We apply these new metrics to analyze the ensembles of several systems of biological interest such as folding and binding of macromolecules (peptide, protein, DNA-protein). In particular, we design a new workflow that is capable of identifying the most important conformations contributing to the protein folding process. We show excellent performance in the resulting clusters (surpassing traditional linkage criteria), along with faster performance and an efficient cost-function to identify when to merge clusters.
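The cost reduction described in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's extended similarity indices. As a stand-in assumption, it uses squared Euclidean distance on a hypothetical array of frames (one row per structure) and shows how column sums give the average pairwise dissimilarity and the medoid in a single pass, where an explicit pairwise comparison would need O(N²) work. All function names below are illustrative.

```python
import numpy as np

def mean_pairwise_sq_dist(X):
    """Mean squared Euclidean distance over all ordered pairs of rows, in O(N)."""
    n = X.shape[0]
    col_sum = X.sum(axis=0)   # per-feature sums, one pass over the data
    sq_sum = np.sum(X * X)    # sum of squared norms of all rows
    # sum_{i,j} ||x_i - x_j||^2 = 2*n*sq_sum - 2*||col_sum||^2
    return 2.0 * (n * sq_sum - col_sum @ col_sum) / n**2

def medoid_index(X):
    """Index of the row with the smallest summed squared distance to all rows."""
    n = X.shape[0]
    col_sum = X.sum(axis=0)
    sq_norms = np.einsum("ij,ij->i", X, X)
    # sum_j ||x_i - x_j||^2 = n*||x_i||^2 - 2*x_i.col_sum + sum_j ||x_j||^2
    totals = n * sq_norms - 2.0 * (X @ col_sum) + sq_norms.sum()
    return int(np.argmin(totals))

def brute_force(X):
    """Reference O(N^2) computation, used only to check the results above."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff**2, axis=-1)
    return d2.mean(), int(np.argmin(d2.sum(axis=1)))

# Hypothetical data: 200 frames, 30 coordinates each.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 30))
print(mean_pairwise_sq_dist(frames), medoid_index(frames))
```

The linear-time functions agree with the brute-force reference because both quantities depend on the data only through the column sums and the squared norms, which is the same kind of shortcut the abstract attributes to comparing all structures at once instead of pair by pair.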


Similar articles

1
Improving the analysis of biological ensembles through extended similarity measures.
Phys Chem Chem Phys. 2021 Dec 22;24(1):444-451. doi: 10.1039/d1cp04019g.
2
Protein Retrieval via Integrative Molecular Ensembles (PRIME) through Extended Similarity Indices.
J Chem Theory Comput. 2024 Jul 23;20(14):6303-6315. doi: 10.1021/acs.jctc.4c00362. Epub 2024 Jul 8.
3
Macromolecular crowding: chemistry and physics meet biology (Ascona, Switzerland, 10-14 June 2012).
Phys Biol. 2013 Aug;10(4):040301. doi: 10.1088/1478-3975/10/4/040301. Epub 2013 Aug 2.
4
Molecular Dynamics Simulations and Diversity Selection by Extended Continuous Similarity Indices.
J Chem Inf Model. 2022 Jul 25;62(14):3415-3425. doi: 10.1021/acs.jcim.2c00433. Epub 2022 Jul 14.
5
Conformational and functional analysis of molecular dynamics trajectories by self-organising maps.
BMC Bioinformatics. 2011 May 14;12:158. doi: 10.1186/1471-2105-12-158.
6
Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 2: speed, consistency, diversity selection.
J Cheminform. 2021 Apr 23;13(1):33. doi: 10.1186/s13321-021-00504-4.
7
Cross-over between discrete and continuous protein structure space: insights into automatic classification and networks of protein structures.
PLoS Comput Biol. 2009 Mar;5(3):e1000331. doi: 10.1371/journal.pcbi.1000331. Epub 2009 Mar 27.
8
Artificial neural networks for efficient clustering of conformational ensembles and their potential for medicinal chemistry.
Curr Top Med Chem. 2013;13(5):642-51. doi: 10.2174/1568026611313050007.
9
Eurecon: Equidistant uniform rigid-body ensemble constructor.
J Mol Graph Model. 2018 Mar;80:313-319. doi: 10.1016/j.jmgm.2018.01.015. Epub 2018 Feb 2.
10
An Effective Approach for Clustering InhA Molecular Dynamics Trajectory Using Substrate-Binding Cavity Features.
PLoS One. 2015 Jul 28;10(7):e0133172. doi: 10.1371/journal.pone.0133172. eCollection 2015.

Cited by

1
Undersampling techniques for non-linear chemical space visualization.
bioRxiv. 2025 Jul 7:2025.07.03.663077. doi: 10.1101/2025.07.03.663077.
2
Scaling k-Means for Multi-Million Frames: A Stratified NANI Approach for Large-Scale MD Simulations.
bioRxiv. 2025 Jun 18:2025.06.15.659780. doi: 10.1101/2025.06.15.659780.
3
SHINE: Deterministic Many-to-Many Clustering of Molecular Pathways.
J Chem Inf Model. 2025 May 26;65(10):4775-4782. doi: 10.1021/acs.jcim.5c00240. Epub 2025 May 6.
4
Extended Quality (eQual): Radial Threshold Clustering Based on n-ary Similarity.
J Chem Inf Model. 2025 May 26;65(10):5062-5070. doi: 10.1021/acs.jcim.4c02341. Epub 2025 May 1.
5
Hierarchical Extended Linkage Method (HELM)'s Deep Dive into Hybrid Clustering Strategies.
bioRxiv. 2025 Mar 10:2025.03.05.641742. doi: 10.1101/2025.03.05.641742.
6
Molecular similarity: Theory, applications, and perspectives.
Artif Intell Chem. 2024 Dec;2(2). doi: 10.1016/j.aichem.2024.100077. Epub 2024 Aug 31.
7
BitBIRCH: efficient clustering of large molecular libraries.
Digit Discov. 2025 Mar 13;4(4):1042-1051. doi: 10.1039/d5dd00030k. eCollection 2025 Apr 9.
8
SHINE: Deterministic Many-to-Many clustering of Molecular Pathways.
bioRxiv. 2025 Feb 8:2025.02.07.636541. doi: 10.1101/2025.02.07.636541.
9
Extended Quality (eQual): Radial threshold clustering based on n-ary similarity.
bioRxiv. 2024 Dec 5:2024.12.05.627001. doi: 10.1101/2024.12.05.627001.
10
Extended Activity Cliffs-Driven Approaches on Data Splitting for the Study of Bioactivity Machine Learning Predictions.
Mol Inform. 2025 Jan;44(1):e202400054. doi: 10.1002/minf.202400054. Epub 2024 Nov 18.