Optimal simultaneous superpositioning of multiple structures with missing data.

Affiliation

Department of Biochemistry, Brandeis University, MS009, Waltham, MA 02454, USA.

Publication information

Bioinformatics. 2012 Aug 1;28(15):1972-9. doi: 10.1093/bioinformatics/bts243. Epub 2012 Apr 27.

Abstract

MOTIVATION

Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually 'missing' from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether.

RESULTS

Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation-maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case.
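To make the least-squares special case concrete, the following is a minimal sketch, not the THESEUS implementation and not the paper's full EM derivation. It alternates between estimating a mean structure from the observed points only and re-fitting each structure to that mean using only its own observed points, so that missing positions contribute no residual. The function names (`kabsch`, `superpose_with_missing`), the array layout and the boolean `mask` convention are assumptions made for illustration. The sketch also assumes that every position is observed in at least one structure and that each structure has at least three non-collinear observed points.

```python
import numpy as np

def kabsch(P, Q):
    """Return the 3x3 rotation R minimizing ||P @ R.T - Q|| for centered P, Q."""
    H = P.T @ Q                                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # proper rotation, no reflection

def superpose_with_missing(coords, mask, n_iter=100):
    """coords: (K, N, 3) array; mask: (K, N) bool, True where a point is observed.

    Returns (superposed, mean); missing entries of `superposed` are NaN.
    """
    K, N, _ = coords.shape
    if not mask.any(axis=0).all():
        raise ValueError("every position must be observed in at least one structure")
    w = mask.astype(float)[..., None]            # (K, N, 1) observation weights
    X = np.where(mask[..., None], coords, 0.0)   # missing entries zeroed, never used
    for k in range(K):                           # start from centered structures
        X[k, mask[k]] -= X[k, mask[k]].mean(axis=0)
    for _ in range(n_iter):
        # Mean structure: average each position over the structures observing it.
        mean = (w * X).sum(axis=0) / w.sum(axis=0)
        # Re-fit every structure to the mean on its observed positions only.
        for k in range(K):
            obs = mask[k]
            P, Q = X[k, obs], mean[obs]
            pc, qc = P.mean(axis=0), Q.mean(axis=0)
            R = kabsch(P - pc, Q - qc)
            X[k, obs] = (P - pc) @ R.T + qc
    mean = (w * X).sum(axis=0) / w.sum(axis=0)
    return np.where(mask[..., None], X, np.nan), mean
```

Unlike superpositioning only the positions shared by all structures, this scheme uses every observed point. The maximum-likelihood variants mentioned above would additionally weight positions by estimated variances, which this sketch does not attempt.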

AVAILABILITY AND IMPLEMENTATION

The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org.

CONTACT

dtheobald@brandeis.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
