Optimal simultaneous superpositioning of multiple structures with missing data.

Author information

Department of Biochemistry, Brandeis University, MS009, Waltham, MA 02454, USA.

Publication information

Bioinformatics. 2012 Aug 1;28(15):1972-9. doi: 10.1093/bioinformatics/bts243. Epub 2012 Apr 27.

DOI:10.1093/bioinformatics/bts243

PMID:22543369

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3400950/

Abstract

MOTIVATION

Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually 'missing' from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether.

RESULTS

Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation-maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case.
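
As an illustration only, the least-squares special case can be sketched in a few lines. This is not the THESEUS code and not the maximum-likelihood method: it is a minimal sketch assuming 2D points stored as complex numbers (so a rotation is multiplication by a unit complex number), missing points marked with `None`, every position observed in at least one structure, and equal weighting of all points. The function names `fit_to_target`, `em_superpose` and `observed_rmsd` are hypothetical. Each iteration imputes every missing point with the current mean mapped back into that structure's own frame (the E-step), then refits the rigid transforms and the mean on the completed data (the M-step).

```python
import cmath

def fit_to_target(points, target):
    """Least-squares rigid fit in 2D: return (u, t) with |u| = 1 so that
    u * p + t is as close as possible to the matching target point."""
    n = len(points)
    cp = sum(points) / n                      # centroid of the moving points
    ct = sum(target) / n                      # centroid of the target points
    s = sum((p - cp).conjugate() * (q - ct) for p, q in zip(points, target))
    u = s / abs(s) if abs(s) > 0 else 1 + 0j  # optimal rotation as a unit complex
    return u, ct - u * cp

def em_superpose(structures, n_iter=200):
    """EM-style least-squares superposition with missing points (None)."""
    n_pos = len(structures[0])
    # Initial mean: the first observed coordinate at each position.
    mean = [next(s[k] for s in structures if s[k] is not None)
            for k in range(n_pos)]
    transforms = [(1 + 0j, 0j)] * len(structures)
    for _ in range(n_iter):
        superposed = []
        for i, s in enumerate(structures):
            u, t = transforms[i]
            # E-step: a missing point is replaced by the current mean,
            # mapped back into this structure's own coordinate frame.
            completed = [p if p is not None else (m - t) / u
                         for p, m in zip(s, mean)]
            # M-step (part 1): refit this structure onto the current mean.
            u, t = fit_to_target(completed, mean)
            transforms[i] = (u, t)
            superposed.append([u * p + t for p in completed])
        # M-step (part 2): the new mean is the average of the completed,
        # superposed structures.
        mean = [sum(col) / len(col) for col in zip(*superposed)]
    return transforms, mean

def observed_rmsd(structures, transforms, mean):
    """RMSD of the observed points about the mean after superposition."""
    sq = [abs(u * p + t - m) ** 2
          for s, (u, t) in zip(structures, transforms)
          for p, m in zip(s, mean) if p is not None]
    return (sum(sq) / len(sq)) ** 0.5

# Toy data (made up for this sketch): three rigid copies of one shape,
# each missing a different point.
reference = [0 + 0j, 2 + 0j, 3 + 1j, 2 + 3j, 0 + 2j, -1 + 1j]

def moved(angle, shift):
    return [cmath.rect(1, angle) * p + shift for p in reference]

structures = [moved(0.0, 0j), moved(0.4, 1 + 2j), moved(-0.7, -3 + 0.5j)]
structures[0][5] = None
structures[1][0] = None
structures[2][2] = None

transforms, mean = em_superpose(structures)
print(observed_rmsd(structures, transforms, mean))
```

In this toy data set only three of the six positions are shared by all three structures, yet every observed point contributes to the fit. A 3D version would replace the complex-number rotation with a rotation matrix obtained, for example, from a singular value decomposition.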

AVAILABILITY AND IMPLEMENTATION

The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org.

CONTACT

dtheobald@brandeis.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
