Scoring profile-to-profile sequence alignments.

Author information

Wang Guoli, Dunbrack Roland L

Affiliation

Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA.

Publication information

Protein Sci. 2004 Jun;13(6):1612-26. doi: 10.1110/ps.03601504.

DOI:10.1110/ps.03601504

PMID:15152092

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC2279992/

Abstract

Sequence alignment profiles have been shown to be very powerful in creating accurate sequence alignments. Profiles are often used to search a sequence database with a local alignment algorithm. More accurate and longer alignments have been obtained with profile-to-profile comparison. There are several steps that must be performed in creating profile-profile alignments, and each involves choices in parameters and algorithms. These steps include (1) what sequences to include in a multiple alignment used to build each profile, (2) how to weight similar sequences in the multiple alignment and how to determine amino acid frequencies from the weighted alignment, (3) how to score a column from one profile aligned to a column of the other profile, (4) how to score gaps in the profile-profile alignment, and (5) how to include structural information. Large-scale benchmarks consisting of pairs of homologous proteins with structurally determined sequence alignments are necessary for evaluating the efficacy of each scoring scheme. With such a benchmark, we have investigated the properties of profile-profile alignments and found that (1) with optimized gap penalties, most column-column scoring functions behave similarly to one another in alignment accuracy; (2) some functions, however, have much higher search sensitivity and specificity; (3) position-specific weighting schemes in determining amino acid counts in columns of multiple sequence alignments are better than sequence-specific schemes; (4) removing positions in the profile with gaps in the query sequence results in better alignments; and (5) adding predicted and known secondary structure information improves alignments.
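Steps (3) and (4) above, scoring one profile column against another and penalizing gaps, can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' implementation: the two column-column functions (a dot product and a Jensen-Shannon-based similarity) are generic examples, since the abstract does not name the functions that were benchmarked. The names `dot_score`, `js_similarity`, and `align_profiles`, the linear gap penalty of -0.5, and the three-letter toy alphabet are all hypothetical choices. A profile is assumed to be a list of columns, each column a list of amino acid frequencies that sums to 1.

```python
import math


def dot_score(p, q):
    """Dot product of two frequency columns (higher means more similar)."""
    return sum(a * b for a, b in zip(p, q))


def js_similarity(p, q):
    """One minus the Jensen-Shannon divergence (base 2); lies in [0, 1]."""
    def kl(x, m):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, m) if a > 0)

    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 1.0 - 0.5 * (kl(p, m) + kl(q, m))


def align_profiles(prof_a, prof_b, score=dot_score, gap=-0.5):
    """Global (Needleman-Wunsch) alignment of two profiles.

    Uses a linear gap penalty (an assumption made for brevity).
    Returns the total score and the list of aligned column index pairs.
    """
    n, m = len(prof_a), len(prof_b)
    S = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap

    # Fill the dynamic programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = S[i - 1][j - 1] + score(prof_a[i - 1], prof_b[j - 1])
            S[i][j] = max(match, S[i - 1][j] + gap, S[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover aligned columns.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        match = S[i - 1][j - 1] + score(prof_a[i - 1], prof_b[j - 1])
        if S[i][j] == match:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif S[i][j] == S[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return S[n][m], pairs


# Toy profiles over a three-letter alphabet (real profiles have 20 rows).
profile_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
profile_b = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
total, aligned = align_profiles(profile_a, profile_b)
```

Swapping `score=js_similarity` into `align_profiles` changes only the column-column function while the gap handling stays fixed. Because the two functions return values on different scales, the gap penalty would need to be re-tuned for each one before their alignments could be compared fairly.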

Similar articles

Comparison of linear gap penalties and profile-based variable gap penalties in profile-profile alignments.

Comput Biol Chem. 2011 Oct 12;35(5):308-18. doi: 10.1016/j.compbiolchem.2011.07.006. Epub 2011 Jul 22.

A comparison of scoring functions for protein sequence profile alignment.

Bioinformatics. 2004 May 22;20(8):1301-8. doi: 10.1093/bioinformatics/bth090. Epub 2004 Feb 12.

Incremental window-based protein sequence alignment algorithms.

Bioinformatics. 2007 Jan 15;23(2):e17-23. doi: 10.1093/bioinformatics/btl297.

Learning scoring schemes for sequence alignment from partial examples.

IEEE/ACM Trans Comput Biol Bioinform. 2008 Oct-Dec;5(4):546-56. doi: 10.1109/TCBB.2008.57.

Bioinformatics. 2015 Mar 1;31(5):674-81. doi: 10.1093/bioinformatics/btu697. Epub 2014 Oct 22.

Gaps in structurally similar proteins: towards improvement of multiple sequence alignment.

Proteins. 2004 Jan 1;54(1):71-87. doi: 10.1002/prot.10508.

Optimizing the size of the sequence profiles to increase the accuracy of protein sequence alignments generated by profile-profile algorithms.

Bioinformatics. 2008 May 1;24(9):1145-53. doi: 10.1093/bioinformatics/btn097. Epub 2008 Mar 12.

Multiple sequence alignment based on profile alignment of intermediate sequences.

J Comput Biol. 2008 Sep;15(7):767-77. doi: 10.1089/cmb.2007.0132.

Accuracy of structure-based sequence alignment of automatic methods.

BMC Bioinformatics. 2007 Sep 20;8:355. doi: 10.1186/1471-2105-8-355.

Cited by

Identification of Evolutionary Trajectories Shared across Human Betacoronaviruses.

Genome Biol Evol. 2023 Jun 1;15(6). doi: 10.1093/gbe/evad076.

Soil Chemistry and Soil History Significantly Structure Oomycete Communities in Crop Rotations.

Appl Environ Microbiol. 2023 Jan 31;89(1):e0131422. doi: 10.1128/aem.01314-22. Epub 2023 Jan 11.

Contrastive learning on protein embeddings enlightens midnight zone.

NAR Genom Bioinform. 2022 Jun 11;4(2):lqac043. doi: 10.1093/nargab/lqac043. eCollection 2022 Jun.

PPalign: optimal alignment of Potts models representing proteins with direct coupling information.

BMC Bioinformatics. 2021 Jun 10;22(1):317. doi: 10.1186/s12859-021-04222-4.

Estimating statistical significance of local protein profile-profile alignments.

BMC Bioinformatics. 2019 Aug 13;20(1):419. doi: 10.1186/s12859-019-2913-3.

A Sequential Segment Based Alpha-Helical Transmembrane Protein Alignment Method.

Int J Biol Sci. 2018 May 22;14(8):901-906. doi: 10.7150/ijbs.24327. eCollection 2018.

Alignment Modulates Ancestral Sequence Reconstruction Accuracy.

Mol Biol Evol. 2018 Jul 1;35(7):1783-1797. doi: 10.1093/molbev/msy055.

Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences.

Int J Genomics. 2017;2017:4081585. doi: 10.1155/2017/4081585. Epub 2017 May 17.

MARS: improving multiple circular sequence alignment using refined sequences.

BMC Genomics. 2017 Jan 14;18(1):86. doi: 10.1186/s12864-016-3477-5.

Cluster Analysis of p53 Binding Site Sequences Reveals Subsets with Different Functions.

Cancer Inform. 2016 Oct 25;15:199-209. doi: 10.4137/CIN.S39968. eCollection 2016.
