Improving model construction of profile HMMs for remote homology detection through structural alignment.

Authors

Bernardes Juliana S, Dávila Alberto M R, Costa Vítor S, Zaverucha Gerson

Affiliation

COPPE, Programa de Engenharia de Sistemas e Computação, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.

Publication information

BMC Bioinformatics. 2007 Nov 9;8:435. doi: 10.1186/1471-2105-8-435.

DOI: 10.1186/1471-2105-8-435

PMID: 17999748

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2245980/

Abstract

BACKGROUND

Remote homology detection is a challenging problem in bioinformatics. Profile hidden Markov models (pHMMs) are arguably among the most successful approaches to it: pHMM packages have a relatively small computational cost and perform particularly well at recognizing remote homologies. Because structural alignments are often more accurate than sequence alignments at identifying motifs and functional residues, this raises the question of whether they could affect the performance of pHMMs trained on proteins in the Twilight Zone. Here we assess the impact of structural alignments on pHMM performance.

RESULTS

We ran our experiments on the SCOP database. Structural alignments were obtained with the 3DCOFFEE and MAMMOTH-mult tools; sequence alignments were obtained with CLUSTALW, TCOFFEE, MAFFT and PROBCONS. We performed leave-one-family-out cross-validation over superfamilies. Performance was evaluated with ROC curves and paired two-tailed t-tests.
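The evaluation protocol named above can be illustrated with a minimal Python sketch. It is not the authors' code: the input layout (a mapping from domain id to family label within one superfamily), the function names, and the choice to pair per-model ROC areas across two alignment methods are all assumptions made for illustration. The abstract does not say which quantities were paired in the t-test.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def leave_one_family_out(domains):
    """Yield (held_out_family, train_ids, test_ids) for one superfamily.

    `domains` maps a domain id to its family label (hypothetical layout).
    Each family is held out once; the remaining families form the training set.
    """
    by_family = defaultdict(list)
    for domain_id, family in domains.items():
        by_family[family].append(domain_id)
    for held_out in sorted(by_family):
        train = sorted(
            d for fam, ids in by_family.items() if fam != held_out for d in ids
        )
        test = sorted(by_family[held_out])
        yield held_out, train, test


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve, computed as the probability that a
    positive example outscores a negative one (ties count as one half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched samples.

    Assumes a and b have equal length and the differences are not all equal.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1
```

The two-tailed p-value would then be read from Student's t distribution with the returned degrees of freedom; with SciPy available, `scipy.stats.ttest_rel(a, b)` computes the statistic and p-value in one call.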

CONCLUSION

We observed that pHMMs derived from structural alignments performed significantly better than pHMMs derived from sequence alignments in low-identity regions, mainly below 20% identity. We believe this is because structural alignment tools are better at focusing on the important patterns that are more often conserved through evolution, resulting in higher-quality pHMMs. On the other hand, the sensitivity of these tools is still quite low in these low-identity regions. Our results suggest a number of possible directions for improvement in this area.
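To make the 20% figure concrete, the sketch below computes percent identity between two rows of an alignment. The abstract does not state how identity was measured, so the definition here is an assumption: identical residues divided by the number of columns in which neither sequence has a gap. The function name and gap character are illustrative as well.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Assumed definition: identical columns / columns where neither row is a gap.
    Returns 0.0 when no column can be compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either row
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0
```

Other denominators are in common use (alignment length, or the length of the shorter sequence), and they can move a pair across a fixed threshold such as 20%, so the chosen definition should be reported alongside any identity cutoff.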
