Comparative modeling without implicit sequence alignments.

Author information

Kolinski Andrzej, Gront Dominik

Affiliation

University of Warsaw, Faculty of Chemistry, Pasteura 1 02-093 Warsaw, Poland.

Publication information

Bioinformatics. 2007 Oct 1;23(19):2522-7. doi: 10.1093/bioinformatics/btm380. Epub 2007 Jul 27.

Abstract

MOTIVATION

The number of known protein sequences is about a thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences, a close or distant structural analog can be identified. The key starting point in classical comparative modeling is to generate the best possible sequence alignment with a template or templates. As sequence similarity decreases, the number of errors in the alignments increases, and these errors are the main cause of the decreasing accuracy of the resulting molecular models. Here we propose a new approach to comparative modeling that does not require the implicit alignment - the model building phase explores geometric, evolutionary and physical properties of a template (or templates).

RESULTS

The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using CABS, a very efficient reduced-representation search engine, to find the best possible superposition of the query protein onto the template, which is represented as a 3D multi-featured scaffold. The criteria used include sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could easily be combined with other efficient modeling tools such as Rosetta, UNRES and others.
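
To make the idea of scoring a superposition against a multi-featured scaffold concrete, here is a minimal Python sketch. It is not the CABS force field or the authors' implementation: the class names, the weights, the crude hydrophobic residue set and the binary burial flag are all assumptions made for illustration, and the local geometric term mentioned in the abstract is left out. It shows only how several per-site criteria might be combined into one score for a trial placement of query residues on template sites.

```python
from dataclasses import dataclass

# Crude hydrophobic class; an assumption for this sketch, not taken from the paper.
HYDROPHOBIC = set("AVILMFWC")


@dataclass(frozen=True)
class ScaffoldSite:
    """One template position carrying several features (hypothetical layout)."""
    residue: str     # template amino acid, one-letter code
    sec_struct: str  # 'H' (helix), 'E' (strand) or 'C' (coil)
    buried: bool     # True if the site is buried in the template structure


@dataclass(frozen=True)
class QueryResidue:
    residue: str       # query amino acid, one-letter code
    predicted_ss: str  # predicted secondary structure: 'H', 'E' or 'C'


def site_score(q, s, w_seq=1.0, w_ss=0.5, w_hyd=0.5):
    """Score one query residue placed on one scaffold site (weights are arbitrary)."""
    seq_term = 1.0 if q.residue == s.residue else 0.0
    ss_term = 1.0 if q.predicted_ss == s.sec_struct else 0.0
    # Reward hydrophobic residues on buried sites and polar residues on exposed ones.
    hyd_term = 1.0 if (q.residue in HYDROPHOBIC) == s.buried else 0.0
    return w_seq * seq_term + w_ss * ss_term + w_hyd * hyd_term


def superposition_score(query, scaffold, placement):
    """Sum site scores over a trial placement {query index: scaffold index}."""
    return sum(site_score(query[i], scaffold[j]) for i, j in placement.items())


query = [QueryResidue("L", "H"), QueryResidue("K", "H")]
scaffold = [ScaffoldSite("L", "H", True), ScaffoldSite("E", "H", False)]

in_register = superposition_score(query, scaffold, {0: 0, 1: 1})
shifted = superposition_score(query, scaffold, {0: 1, 1: 0})
print(in_register, shifted)
```

In this toy setup the placement is simply a dictionary that a conformational search would update as the chain moves; no alignment is fixed in advance, and the placement that matches more scaffold features receives the higher score.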

