A simple method for finding related sequences by adding probabilities of alternative alignments.

Affiliations

Artificial Intelligence Research Center, AIST, Tokyo 135-0064, Japan; Department of Computational Biology and Medical Sciences, University of Tokyo, Chiba 277-8568, Japan; Computational Bio Big Data Open Innovation Laboratory, AIST, Tokyo 169-8555, Japan

Publication information

Genome Res. 2024 Sep 20;34(8):1165-1173. doi: 10.1101/gr.279464.124.

DOI:10.1101/gr.279464.124
PMID:39152037
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11444175/
Abstract

The main way of analyzing genetic sequences is by finding sequence regions that are related to each other. There are many methods to do that, usually based on this idea: Find an alignment of two sequence regions, which would be unlikely to exist between unrelated sequences. Unfortunately, it is hard to tell if an alignment is likely to exist by chance. Also, the precise alignment of related regions is uncertain. One alignment does not hold all evidence that they are related. We should consider alternative alignments too. This is rarely done, because we lack a simple and fast method that fits easily into practical sequence-search software. Described here is the simplest-conceivable change to standard sequence alignment, which sums probabilities of alternative alignments and makes it easier to tell if a similarity is likely to occur by chance. This approach is better than standard alignment at finding distant relationships, at least in a few tests. It can be used in practical sequence-search software, with minimal increase in implementation difficulty or run time. It generalizes to different kinds of alignment, for example, DNA-versus-protein with frameshifts. Thus, it can widely contribute to finding subtle relationships between sequences.
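
The contrast the abstract draws, one best alignment versus a sum over alternative alignments, can be illustrated with a small sketch. This is not the paper's exact recurrence: it assumes a toy match/mismatch score, a linear gap cost, and a scale parameter `t` that turns a score into a weight `exp(score / t)`; the names `pair_score` and `best_and_summed` and all parameter values are hypothetical. The two dynamic programs run side by side and differ only in replacing `max` with a sum and score addition with weight multiplication.

```python
import math


def pair_score(x, y, match=2.0, mismatch=-3.0):
    """Toy substitution score (assumed values, not from the paper)."""
    return match if x == y else mismatch


def best_and_summed(a, b, gap=4.0, t=1.0):
    """Return (best local alignment score, t * log of summed alignment weights).

    Alignments start and end with an aligned pair. Each alignment with
    score S contributes exp(S / t) to the sum. Linear gap cost and the
    scale t are illustrative assumptions.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    gap_weight = math.exp(-gap / t)
    rows, cols = len(a) + 1, len(b) + 1
    # p_max[i][j]: best score of a partial alignment reaching cell (i, j)
    # p_sum[i][j]: summed weight of all partial alignments reaching (i, j)
    p_max = [[-math.inf] * cols for _ in range(rows)]
    p_sum = [[0.0] * cols for _ in range(rows)]
    best, total = -math.inf, 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            s = pair_score(a[i - 1], b[j - 1])
            # Standard alignment: extend the best predecessor or start afresh.
            m_max = s + max(0.0, p_max[i - 1][j - 1])
            # Summed version: extend every predecessor and also start afresh.
            m_sum = math.exp(s / t) * (1.0 + p_sum[i - 1][j - 1])
            p_max[i][j] = max(m_max, p_max[i - 1][j] - gap, p_max[i][j - 1] - gap)
            p_sum[i][j] = m_sum + gap_weight * (p_sum[i - 1][j] + p_sum[i][j - 1])
            best = max(best, m_max)
            total += m_sum
    return best, t * math.log(total)


best, summed = best_and_summed("ACGT", "ACGT")
print(best, summed)
```

Because the best alignment is one term of the sum, the summed value on the score scale is never below the best score. In this simplified scheme, different orderings of adjacent opposite gaps count as distinct alignments, which a production method would need to handle deliberately.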

Similar articles

1. A simple method for finding related sequences by adding probabilities of alternative alignments.
   Genome Res. 2024 Sep 20;34(8):1165-1173. doi: 10.1101/gr.279464.124.
2. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.
   BMC Bioinformatics. 2013;14 Suppl 11(Suppl 11):S2. doi: 10.1186/1471-2105-14-S11-S2. Epub 2013 Sep 13.
3. CSA: an efficient algorithm to improve circular DNA multiple alignment.
   BMC Bioinformatics. 2009 Jul 23;10:230. doi: 10.1186/1471-2105-10-230.
4. Computing posterior probabilities for score-based alignments using ppALIGN.
   Stat Appl Genet Mol Biol. 2012 May 16;11(4):Article 1. doi: 10.1515/1544-6115.1702.
5. The Sequence Alignment/Map format and SAMtools.
   Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352. Epub 2009 Jun 8.
6. ReformAlign: improved multiple sequence alignments using a profile-based meta-alignment approach.
   BMC Bioinformatics. 2014 Aug 7;15(1):265. doi: 10.1186/1471-2105-15-265.
7. AlignStat: a web-tool and R package for statistical comparison of alternative multiple sequence alignments.
   BMC Bioinformatics. 2016 Oct 26;17(1):434. doi: 10.1186/s12859-016-1300-6.
8. A novel partial sequence alignment tool for finding large deletions.
   ScientificWorldJournal. 2012;2012:694813. doi: 10.1100/2012/694813. Epub 2012 Apr 1.
9. Using CLUSTAL for multiple sequence alignments.
   Methods Enzymol. 1996;266:383-402. doi: 10.1016/s0076-6879(96)66024-8.
10. Aligning Protein-Coding Nucleotide Sequences with MACSE.
    Methods Mol Biol. 2021;2231:51-70. doi: 10.1007/978-1-0716-1036-7_4.

Cited by

1. NEAR: neural embeddings for amino acid relationships.
   Bioinformatics. 2025 Jul 1;41(Supplement_1):i449-i457. doi: 10.1093/bioinformatics/btaf198.
2. NEAR: Neural Embeddings for Amino acid Relationships.
   bioRxiv. 2025 Apr 9:2024.01.25.577287. doi: 10.1101/2024.01.25.577287.
3. nail: software for high-speed, high-sensitivity protein sequence annotation.
   bioRxiv. 2024 Jan 30:2024.01.27.577580. doi: 10.1101/2024.01.27.577580.
4. DNA Conserved in Diverse Animals Since the Precambrian Controls Genes for Embryonic Development.
   Mol Biol Evol. 2023 Dec 1;40(12). doi: 10.1093/molbev/msad275.