Inserm U1134, Paris, France, Université Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, Paris, France, Institut National de la Transfusion Sanguine, Paris, France and Laboratory of Excellence GR-Ex, Paris, France.
Bioinformatics. 2015 Dec 1;31(23):3782-9. doi: 10.1093/bioinformatics/btv462. Epub 2015 Aug 7.
Template-based modeling, the most successful approach for predicting protein 3D structure, often requires detecting distant evolutionary relationships between the target sequence and proteins of known structure. Developed for this purpose, fold recognition methods use elaborate strategies to exploit evolutionary information, mainly by encoding amino acid sequence into profiles. Since protein structure is more conserved than sequence, the inclusion of structural information can improve the detection of remote homology.
Here, we present ORION, a new fold recognition method based on the pairwise comparison of hybrid profiles that contain evolutionary information from both protein sequence and structure. Our method uses the 16-state structural alphabet Protein Blocks, which provides an accurate 1D description of the local conformations of protein structures. ORION systematically outperforms PSI-BLAST and HHsearch on several benchmarks, including target sequences from the modeling competitions CASP8, CASP9 and CASP10, and detects ∼10% more templates at the SCOP fold and superfamily levels.
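To make the idea of a hybrid profile concrete, the following is a minimal illustrative sketch and not ORION's actual implementation. It assumes that a hybrid profile can be approximated by concatenating, at each aligned position, the amino acid frequencies with the frequencies of the 16 Protein Blocks states (conventionally labeled a to p). The function names, the weighted dot-product score and the `seq_weight` parameter are hypothetical choices made only for illustration; the scoring scheme actually used by ORION is not described in this abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues
PROTEIN_BLOCKS = "abcdefghijklmnop"    # 16 Protein Blocks states


def column_frequencies(aligned, alphabet):
    """Per-column letter frequencies for equal-length aligned strings.

    Characters outside the alphabet (e.g. gaps) are ignored.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned strings must have equal length")
    profile = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned if s[i] in alphabet)
        total = sum(counts.values())
        profile.append([counts[a] / total if total else 0.0 for a in alphabet])
    return profile


def hybrid_profile(aligned_seqs, aligned_pbs):
    """Hypothetical hybrid profile: 20 sequence + 16 structure frequencies."""
    seq = column_frequencies(aligned_seqs, AMINO_ACIDS)
    pbs = column_frequencies(aligned_pbs, PROTEIN_BLOCKS)
    if len(seq) != len(pbs):
        raise ValueError("sequence and Protein Blocks alignments differ in length")
    return [s + p for s, p in zip(seq, pbs)]


def column_score(col_a, col_b, seq_weight=0.5):
    """Assumed score: weighted dot product of the two halves of each column."""
    n = len(AMINO_ACIDS)
    seq_part = sum(x * y for x, y in zip(col_a[:n], col_b[:n]))
    pb_part = sum(x * y for x, y in zip(col_a[n:], col_b[n:]))
    return seq_weight * seq_part + (1.0 - seq_weight) * pb_part


# Toy alignment: two homologs, two positions, with matching Protein Blocks strings.
profile = hybrid_profile(["AC", "AD"], ["mm", "mn"])
```

In this toy setting, two positions that differ in sequence but share a local conformation still receive a non-zero score through the structural half of the column, which is the intuition behind adding structural information to detect remote homology.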
The software is freely available for download at http://www.dsimb.inserm.fr/orion/.
Contact: jean-christophe.gelly@univ-paris-diderot.fr
Supplementary data are available at Bioinformatics online.