Multiple structure alignment and consensus identification for proteins.

Affiliation

Department of Computer Science, Gettysburg College, Gettysburg, PA, USA.

Publication information

BMC Bioinformatics. 2010 Feb 2;11:71. doi: 10.1186/1471-2105-11-71.

Abstract

BACKGROUND

An algorithm is presented to compute a multiple structure alignment for a set of proteins and to generate a consensus (pseudo) protein that captures the substructures common to the given proteins. The algorithm represents each protein as a sequence of coordinate triples, one for each alpha-carbon atom along the backbone. It then iteratively computes a sequence of transformation matrices (i.e., translations and rotations) to align the proteins in space and generate the consensus. The algorithm is a heuristic: it computes an approximation to the optimal alignment, i.e., the one that minimizes the sum of the pairwise distances between the consensus and the transformed proteins.
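
The iterative scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on strong assumptions that the abstract does not state: every structure has the same number of alpha-carbons, and residue i in one structure corresponds to residue i in every other, so gaps and residue matching are not handled. The Kabsch least-squares superposition is swapped in as one way to obtain each rotation and translation, the consensus is taken to be the per-position mean, and the score is the sum of squared distances rather than the sum of distances named above. The function names `kabsch` and `align_to_consensus` are hypothetical.

```python
import numpy as np


def kabsch(mobile, target):
    """Rigid transform (R, t) that best maps `mobile` onto `target`.

    Both arguments are (N, 3) arrays of corresponding points. The result
    minimizes the sum of squared distances between mobile @ R.T + t and target.
    """
    mob_c = mobile.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # 3x3 covariance of the centered point sets.
    H = (mobile - mob_c).T @ (target - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ mob_c
    return R, t


def align_to_consensus(structures, max_iter=50, tol=1e-9):
    """Alternate between superposing all structures and averaging them.

    Assumption (not from the source): all structures have equal length and
    are already in residue-to-residue correspondence.
    """
    consensus = structures[0].copy()  # arbitrary starting consensus
    prev_score = np.inf
    for _ in range(max_iter):
        # Step 1: rotate and translate each structure onto the consensus.
        moved = []
        for s in structures:
            R, t = kabsch(s, consensus)
            moved.append(s @ R.T + t)
        # Step 2: rebuild the consensus as the per-position mean.
        consensus = np.mean(moved, axis=0)
        # Sum of squared distances between consensus and moved structures.
        score = sum(float(((m - consensus) ** 2).sum()) for m in moved)
        if prev_score - score < tol:
            break
        prev_score = score
    return consensus, moved, score
```

Both steps can only lower the squared-distance score, which is why a simple "stop when the improvement is below `tol`" test is enough here. The result can still depend on which structure seeds the consensus, consistent with the heuristic nature of the approach.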

RESULTS

Experimental results show that the algorithm converges quite rapidly and generates consensus structures that are visually similar to the input proteins. A comparison with other coordinate-based alignment algorithms (MAMMOTH and MATT) shows that the proposed algorithm is competitive in terms of speed and the sizes of the conserved regions discovered in an extensive benchmark dataset derived from the HOMSTRAD and SABmark databases. The algorithm has been implemented in C++ and can be downloaded from the project's web page. Alternatively, the algorithm can be used via a web server that makes it possible to align protein structures by uploading files from a local disk or by downloading protein data from the RCSB Protein Data Bank.
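
As an illustration of how structure files from the Protein Data Bank can be turned into the alpha-carbon coordinate triples that the algorithm works on, here is a small Python sketch. It assumes the input is text in the legacy fixed-column PDB format, and the function name `read_ca_coordinates` is hypothetical; it says nothing about how the authors' C++ program or web server actually reads its input.

```python
def read_ca_coordinates(pdb_text, chain=None):
    """Return a list of (x, y, z) tuples, one per alpha-carbon (CA) atom.

    Only the first model is read. If `chain` is given, only that chain
    identifier is kept. Column positions follow the fixed-width ATOM record.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # stop after the first model of a multi-model file
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA":  # atom name, columns 13-16
            continue
        if line[16] not in (" ", "A"):  # keep one alternate location only
            continue
        if chain is not None and line[21] != chain:  # chain ID, column 22
            continue
        x = float(line[30:38])  # columns 31-38
        y = float(line[38:46])  # columns 39-46
        z = float(line[46:54])  # columns 47-54
        coords.append((x, y, z))
    return coords
```

Reading only `ATOM` records keeps calcium ions, which also carry the atom name `CA` but normally appear in `HETATM` records, out of the backbone trace.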

CONCLUSIONS

An algorithm is presented to compute a multiple structure alignment for a set of proteins, together with their consensus structure. Experimental results show its effectiveness in terms of the quality of the alignment and computational cost.
