Runthala Ashish, Chowdhury Shibasish
Department of Biological Sciences, Birla Institute of Technology and Science, Pilani-333031, India.
J Bioinform Comput Biol. 2019 Apr;17(2):1950006. doi: 10.1142/S0219720019500069.
Among protein modeling methodologies, comparative modeling is considered the most popular and reliable approach for modeling protein structure. However, selecting the best set of templates remains a major challenge. An effective template-ranking algorithm is developed to select only the reliable hits for predicting protein structures. The algorithm uses both pairwise and multiple sequence alignments of the template hits to rank them and select the best possible set of templates. It captures several key sequence and structural features of the template hits and converts them into scores for ranking. The selected set of templates is then used to model the target. The modeling accuracy of the algorithm is evaluated on CASP8, CASP9 and CASP10 targets containing TBM-HA domains. On average, this template ranking and selection algorithm improves GDT-TS, GDT-HA and TM_Score by 3.531, 4.814 and 0.022, respectively. Further, it is shown that including structurally similar templates with ample conformational diversity is crucial for the modeling algorithm to span the target sequence as fully and reliably as possible and to construct a near-native model. Optimal model sampling is also key to predicting the best possible target structure.
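The GDT-TS and GDT-HA figures quoted in the abstract can be read more easily with the standard definitions in mind: each is the mean, over four distance cutoffs, of the percentage of target residues whose C-alpha atom in the model lies within that cutoff of its position in the native structure (1, 2, 4 and 8 Å for GDT-TS; 0.5, 1, 2 and 4 Å for GDT-HA). The sketch below is a simplified illustration, not the authors' evaluation pipeline. It assumes the model has already been superposed on the native structure and that one distance per target residue is supplied, whereas the full GDT procedure searches over many superpositions to maximize each fraction. The function names and the example distances are hypothetical.

```python
def gdt_score(distances, cutoffs):
    """Mean percentage of residues within each cutoff (simplified GDT).

    distances: one C-alpha deviation (in angstroms) per target residue,
               measured under a single, precomputed superposition
               (assumption; real GDT optimizes the superposition per cutoff).
               Unmodeled residues can be passed as float("inf").
    cutoffs:   distance thresholds in angstroms.
    """
    n = len(distances)
    if n == 0:
        raise ValueError("at least one residue distance is required")
    # Fraction of target residues within each cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # Average the fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


def gdt_ts(distances):
    """GDT-TS: cutoffs of 1, 2, 4 and 8 angstroms."""
    return gdt_score(distances, (1.0, 2.0, 4.0, 8.0))


def gdt_ha(distances):
    """GDT-HA (high accuracy): cutoffs of 0.5, 1, 2 and 4 angstroms."""
    return gdt_score(distances, (0.5, 1.0, 2.0, 4.0))


# Hypothetical per-residue deviations for a six-residue target.
example = [0.3, 0.8, 1.5, 3.0, 6.0, 9.5]
print(round(gdt_ts(example), 3), round(gdt_ha(example), 3))
```

Because GDT-HA uses tighter cutoffs, it is never higher than GDT-TS for the same set of distances, so a gain in GDT-HA reflects improvement in the most accurately modeled residues.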