Geometric aspects of biological sequence comparison.

Author information

Stojmirović Aleksandar, Yu Yi-Kuo

Affiliation

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.

Publication information

J Comput Biol. 2009 Apr;16(4):579-610. doi: 10.1089/cmb.2008.0100.

Abstract

We introduce a geometric framework suitable for studying the relationships among biological sequences. In contrast to previous works, our formulation allows asymmetric distances (quasi-metrics), originating from uneven weighting of strings, which may induce non-trivial partial orders on sets of biosequences. The distances considered are more general than traditional generalized string edit distances. In particular, our framework enables non-trivial conversion between sequence similarities, both local and global, and distances. Our constructions apply to a wide class of scoring schemes and require much less restrictive gap penalties than the ones regularly used. Numerous examples are provided to illustrate the concepts introduced and their potential applications.
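The abstract does not spell out how similarities are converted to distances. As a minimal sketch, the block below uses one commonly cited conversion, d(x, y) = s(x, x) - s(x, y), applied to a global alignment score; this choice is an assumption for illustration and is not necessarily the construction used in the paper. The two-letter alphabet, the score values, the gap penalty, and the function names `global_similarity` and `quasi_distance` are all hypothetical. Because the self-scores of the two letters differ (uneven weighting), the resulting distance is asymmetric.

```python
# Toy scoring scheme: hypothetical values chosen only to illustrate asymmetry.
SCORES = {("A", "A"): 4, ("B", "B"): 1, ("A", "B"): -1, ("B", "A"): -1}
GAP = 2  # linear gap penalty, subtracted once per gapped position


def global_similarity(x: str, y: str) -> int:
    """Global alignment score (Needleman-Wunsch) with a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of x with y[:j].
    prev = [-GAP * j for j in range(len(y) + 1)]
    for i, a in enumerate(x, start=1):
        curr = [-GAP * i]  # x[:i] aligned entirely against gaps
        for j, b in enumerate(y, start=1):
            curr.append(max(
                prev[j - 1] + SCORES[a, b],  # align a with b
                prev[j] - GAP,               # a against a gap
                curr[j - 1] - GAP,           # b against a gap
            ))
        prev = curr
    return prev[-1]


def quasi_distance(x: str, y: str) -> int:
    """Assumed conversion: self-similarity of x minus similarity of x to y."""
    return global_similarity(x, x) - global_similarity(x, y)


if __name__ == "__main__":
    print(quasi_distance("A", "B"), quasi_distance("B", "A"))  # prints "5 2"
```

With these toy scores, the distance from "A" to "B" is 5 while the distance from "B" to "A" is 2, so the function is not symmetric, yet it is zero from any string to itself and satisfies the triangle inequality on the small cases checked.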
