Bernsel Andreas, Viklund Håkan, Elofsson Arne
Center for Biomembrane Research, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden.
Proteins. 2008 May 15;71(3):1387-99. doi: 10.1002/prot.21825.
Compared with globular proteins, transmembrane proteins are surrounded by a more intricate environment and, consequently, their amino acid composition varies between the different compartments. Existing algorithms for homology detection are generally developed with globular proteins in mind and may not be optimal for detecting distant homology between transmembrane proteins. Here, we introduce a new profile-profile alignment method for remote homology detection of transmembrane proteins in a hidden Markov model framework that takes advantage of the sequence constraints imposed by the hydrophobic interior of the membrane. We expect that, for distant membrane protein homologs, the hydrophobicity pattern and the transmembrane topology remain better conserved even when the sequences have diverged too far for the relationship to be recognized from sequence alone. By using this information in parallel with sequence information, we show that both sensitivity and specificity can be substantially improved for remote homology detection in two independent test sets. In addition, we show that alignment quality can be improved for the most distant homologs in a public dataset of membrane protein structures. Applying the method to the Pfam domain database, we are able to suggest new putative evolutionary relationships for a few relatively uncharacterized protein domain families, several of which are confirmed by other methods. The method is called Searcher for Homology Relationships of Integral Membrane Proteins (SHRIMP) and is available for download at http://www.sbc.su.se/shrimp/.
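The idea of using topology information in parallel with sequence information can be illustrated with a minimal sketch. It is not the SHRIMP implementation: the abstract describes a hidden Markov model framework, whereas the sketch substitutes a plain Smith-Waterman-style local alignment with a linear gap penalty. The function names, the two-letter toy alphabet, the topology labels ("M" membrane, "i" inside, "o" outside), the weight, and the gap penalty are all assumptions made for illustration.

```python
import math

def column_score(p, q, background):
    """Log-odds-style score between two profile columns (dicts: residue -> probability)."""
    return math.log(sum(p[a] * q[a] / background[a] for a in background))

def combined_score(col_a, col_b, background, weight):
    """Sequence score plus a weighted bonus or penalty for topology (dis)agreement."""
    seq = column_score(col_a["profile"], col_b["profile"], background)
    topo = 1.0 if col_a["topology"] == col_b["topology"] else -1.0
    return seq + weight * topo

def local_align_score(a, b, background, weight=0.5, gap=-1.0):
    """Best local alignment score between two annotated profiles (linear gap penalty)."""
    prev = [0.0] * (len(b) + 1)
    best = 0.0
    for i in range(1, len(a) + 1):
        cur = [0.0]
        for j in range(1, len(b) + 1):
            s = combined_score(a[i - 1], b[j - 1], background, weight)
            cur.append(max(0.0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

# Toy data (assumed): a two-letter alphabet with a uniform background.
background = {"L": 0.5, "K": 0.5}
hydrophobic = {"L": 0.9, "K": 0.1}
polar = {"L": 0.1, "K": 0.9}

def annotate(columns, labels):
    """Pair each profile column with its predicted topology label."""
    return [{"profile": c, "topology": t} for c, t in zip(columns, labels)]

columns = [hydrophobic, hydrophobic, polar]
query = annotate(columns, "MMi")
same_topology = annotate(columns, "MMi")
other_topology = annotate(columns, "ooM")

print(local_align_score(query, same_topology, background, weight=0.5))
print(local_align_score(query, other_topology, background, weight=0.5))
```

With identical profile columns, the pair whose topology labels agree scores higher than it would on sequence alone, and the pair whose labels disagree scores lower. This is the intended effect of the extra term: topology agreement can support a match that the sequence signal alone leaves ambiguous. In practice the weight would need to be tuned on benchmark data.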