Ahmed Rezwan, Rangwala Huzefa, Karypis George
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA.
J Bioinform Comput Biol. 2010 Feb;8(1):39-57. doi: 10.1142/s0219720010004501.
Alpha-helical transmembrane proteins mediate many key biological processes and are encoded by 20%-30% of all genes in many organisms. Because their high-resolution 3D structures are difficult to determine experimentally, computational methods that predict the location and orientation of transmembrane helix segments from sequence information are essential. We present TOPTMH, a new transmembrane helix topology prediction method that combines support vector machines (SVMs), hidden Markov models (HMMs), and a widely used rule-based scheme. The main contribution is a two-stage prediction approach: a binary SVM classifier first predicts which residues belong to a helix, and a pair of HMMs then combines the SVM predictions with hydropathy-based features to identify entire transmembrane helix segments, capturing the structural characteristics of these proteins. TOPTMH outperforms state-of-the-art prediction methods and achieves the best performance on an independent static benchmark.
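The two-stage idea in the abstract (score each residue, then decode whole helix segments with a sequence model) can be illustrated with a minimal sketch. This is not the TOPTMH implementation: a sliding-window mean of the Kyte-Doolittle hydropathy scale stands in for the SVM's per-residue scores, and a single two-state Viterbi decoder stands in for the pair of HMMs. The function names, the window size of 19, the threshold of 1.6, and the switch penalty are all illustrative assumptions, and the sketch predicts helix locations only, not orientation.

```python
# Kyte-Doolittle hydropathy scale (published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def residue_scores(sequence, window=19):
    """Stage 1 stand-in: mean hydropathy in a window centred on each residue.

    ASSUMPTION: TOPTMH uses a trained SVM here; this is only a placeholder
    that yields one score per residue. Unknown residues count as 0.0.
    """
    half = window // 2
    values = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in sequence.upper()]
    scores = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        scores.append(sum(chunk) / len(chunk))
    return scores


def viterbi_two_state(scores, threshold=1.6, switch_penalty=2.0):
    """Stage 2 stand-in: decode labels (1 = helix, 0 = non-helix).

    ASSUMPTION: a hand-set two-state model, not the paper's pair of HMMs.
    Each residue rewards the state that agrees with its score margin, and
    every state change costs `switch_penalty`, which suppresses very short
    spurious segments.
    """
    if not scores:
        return []

    def emit(score, state):
        margin = score - threshold
        return margin if state == 1 else -margin

    best = [emit(scores[0], 0), emit(scores[0], 1)]
    back = []  # back[t - 1][state] = best previous state for position t
    for score in scores[1:]:
        new_best, pointers = [], []
        for state in (0, 1):
            stay = best[state]
            switch = best[1 - state] - switch_penalty
            if stay >= switch:
                prev, value = state, stay
            else:
                prev, value = 1 - state, switch
            new_best.append(value + emit(score, state))
            pointers.append(prev)
        best = new_best
        back.append(pointers)

    # Trace back from the best final state.
    state = 0 if best[0] >= best[1] else 1
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()
    return labels


def helix_segments(labels):
    """Collapse per-residue labels into half-open (start, end) helix runs."""
    segments, start = [], None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i
        elif label != 1 and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


def predict_helix_segments(sequence):
    """Run both stand-in stages on a one-letter amino acid sequence."""
    return helix_segments(viterbi_two_state(residue_scores(sequence)))
```

Calling `predict_helix_segments` on a one-letter amino acid string returns a list of half-open `(start, end)` index pairs. Separating per-residue scoring from segment decoding mirrors the structure the abstract describes: the decoder can reject isolated high-scoring residues that a per-residue classifier alone would label as helix.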