Enhanced recognition of protein transmembrane domains with prediction-based structural profiles.
Author information
Cao Baoqiang, Porollo Aleksey, Adamczak Rafal, Jarrell Mark, Meller Jaroslaw
Affiliation
Biomedical Informatics, Children's Hospital Research Foundation, Cincinnati, OH 45229, USA.
Publication information
Bioinformatics. 2006 Feb 1;22(3):303-9. doi: 10.1093/bioinformatics/bti784. Epub 2005 Nov 17.
MOTIVATION
Membrane domain prediction has recently been re-evaluated by several groups, suggesting that the accuracy of existing methods is still rather limited. In this work, we revisit this problem and propose novel methods for prediction of alpha-helical as well as beta-sheet transmembrane (TM) domains. The new approach is based on a compact representation of an amino acid residue and its environment, which consists of predicted solvent accessibility and secondary structure of each amino acid. A recently introduced method for solvent accessibility prediction trained on a set of soluble proteins is used here to indicate segments of residues that are predicted not to be accessible to water and, therefore, may be 'buried' in the membrane. While evolutionary profiles in the form of a multiple alignment are used to derive these simple 'structural profiles', they are not used explicitly for the membrane domain prediction and the overall number of parameters in the model is significantly reduced. This offers the possibility of a more reliable estimation of the free parameters in the model with a limited number of experimentally resolved membrane protein structures.
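To make the idea of a compact 'structural profile' concrete, the following is a minimal sketch of how a residue and its sequence environment might be encoded from predicted solvent accessibility and secondary structure alone. It is not the authors' implementation: the three-state alphabet `HEC`, the function names, the window size and the zero padding at the chain termini are all assumptions made for illustration. The sketch shows only why the input is small: four values per window position here, compared with the 20 values per position of a typical position-specific scoring matrix column.

```python
from typing import List, Sequence

# Assumed three-state secondary structure alphabet: helix, strand, coil.
SS_STATES = "HEC"
FEATURES_PER_RESIDUE = 1 + len(SS_STATES)  # predicted RSA + one-hot SS


def residue_features(rsa: float, ss: str) -> List[float]:
    """Encode one residue as [predicted RSA, one-hot secondary structure]."""
    if not 0.0 <= rsa <= 1.0:
        raise ValueError("relative solvent accessibility must lie in [0, 1]")
    if ss not in SS_STATES:
        raise ValueError(f"unknown secondary structure state: {ss!r}")
    return [rsa] + [1.0 if ss == state else 0.0 for state in SS_STATES]


def window_features(rsa: Sequence[float], ss: str,
                    center: int, half_width: int = 2) -> List[float]:
    """Concatenate residue encodings over a sliding window around `center`.

    Positions beyond the chain termini are padded with zeros. This is a
    simplification for the sketch: a zero RSA also means 'buried', so a
    real model would likely need an explicit padding flag.
    """
    if len(rsa) != len(ss):
        raise ValueError("rsa and ss must have the same length")
    features: List[float] = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(rsa):
            features.extend(residue_features(rsa[i], ss[i]))
        else:
            features.extend([0.0] * FEATURES_PER_RESIDUE)
    return features


if __name__ == "__main__":
    # Hypothetical predictions for a five-residue fragment.
    predicted_rsa = [0.10, 0.05, 0.02, 0.40, 0.80]
    predicted_ss = "HHHCC"
    vector = window_features(predicted_rsa, predicted_ss, center=2)
    print(len(vector))  # window of 5 positions x 4 features each
```

A vector of this kind would then be passed to whatever classifier assigns each residue to a membrane or non-membrane class; the choice of classifier is outside the scope of this sketch.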
RESULTS
Using cross-validated training on available sets of structurally resolved and non-redundant alpha and beta membrane proteins, we demonstrate that membrane domain prediction methods based on such a compact representation outperform approaches that explicitly use evolutionary profiles and multiple alignments. Moreover, using an external evaluation by the TMH Benchmark server, we show that our final prediction protocol for TM helix prediction is competitive with the state-of-the-art methods, achieving per-residue accuracy of approximately 89% and per-segment accuracy of approximately 80% on the set of high-resolution structures used by the TMH Benchmark server. At the same time, the observed rates of confusion with signal peptides and globular proteins are the lowest among the tested methods. The new method is available online at http://minnou.cchmc.org.
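The abstract reports per-residue and per-segment accuracy without defining them. The sketch below illustrates one common way such measures can be computed from a per-residue labeling; it does not reproduce the TMH Benchmark server's exact definitions. The label alphabet (`M` for membrane, `-` for everything else), the function names and the minimum overlap of 3 residues are assumptions chosen for illustration.

```python
from typing import List, Tuple


def per_residue_accuracy(observed: str, predicted: str) -> float:
    """Fraction of residues whose predicted label matches the observed one."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def segments(labels: str, state: str = "M") -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of consecutive `state` labels."""
    found: List[Tuple[int, int]] = []
    start = None
    for i, label in enumerate(labels):
        if label == state and start is None:
            start = i
        elif label != state and start is not None:
            found.append((start, i))
            start = None
    if start is not None:
        found.append((start, len(labels)))
    return found


def per_segment_recall(observed: str, predicted: str,
                       min_overlap: int = 3) -> float:
    """Fraction of observed segments overlapped by some predicted segment.

    An observed segment counts as recovered if at least `min_overlap`
    residues are shared with a predicted segment (an assumed criterion).
    """
    observed_segments = segments(observed)
    predicted_segments = segments(predicted)
    if not observed_segments:
        raise ValueError("no observed membrane segments to evaluate")
    hits = 0
    for obs_start, obs_end in observed_segments:
        if any(min(obs_end, p_end) - max(obs_start, p_start) >= min_overlap
               for p_start, p_end in predicted_segments):
            hits += 1
    return hits / len(observed_segments)


if __name__ == "__main__":
    # Hypothetical labelings, not data from the paper.
    observed = "--MMMMM---MMMM-"
    predicted = "---MMMM-----MM-"
    print(per_residue_accuracy(observed, predicted))
    print(per_segment_recall(observed, predicted))
```

The two measures can diverge: a prediction may label most residues correctly while still missing or truncating whole segments, which is one reason both figures are usually reported together.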