Post-Alignment Adjustment and Its Automation.

Affiliations

Department of Biology, University of Ottawa, Marie-Curie Private, Ottawa, ON K1N 9A7, Canada.

Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON K1H 8M5, Canada.

Publication information

Genes (Basel). 2021 Nov 18;12(11):1809. doi: 10.3390/genes12111809.

DOI: 10.3390/genes12111809
PMID: 34828415
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8623120/

Abstract

Multiple sequence alignment (MSA) is the basis for almost all sequence comparison and molecular phylogenetic inferences. Large-scale genomic analyses are typically associated with automated progressive MSA without subsequent manual adjustment, which itself is often error-prone because of the lack of a consistent and explicit criterion. Here, I outlined several commonly encountered alignment errors that cannot be avoided by progressive MSA for nucleotide, amino acid, and codon sequences. Methods that could be automated to fix such alignment errors were then presented. I emphasized the utility of position weight matrix as a new tool for MSA refinement and illustrated its usage by refining the MSA of nucleotide and amino acid sequences. The main advantages of the position weight matrix approach include (1) its use of information from all sequences, in contrast to other commonly used methods based on pairwise alignment scores and inconsistency measures, and (2) its speedy computation, making it suitable for a large number of long viral genomic sequences.
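
The abstract does not give the scoring formula, so the following is only a minimal sketch of how a position weight matrix (PWM) might be used to choose between alternative gap placements for one sequence. It assumes a generic log2-odds PWM with a pseudocount and a uniform background; the function names (`build_pwm`, `score_row`, `best_placement`) and the toy sequences are hypothetical illustrations, not the author's implementation.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: nucleotide sequences, '-' marks a gap


def build_pwm(aligned_seqs, pseudocount=1.0, background=None):
    """Build column-wise log2-odds scores from all rows of an alignment.

    Gaps are ignored when counting; a uniform background is assumed
    unless one is supplied.
    """
    if background is None:
        background = {b: 1.0 / len(ALPHABET) for b in ALPHABET}
    length = len(aligned_seqs[0])
    pwm = []
    for col in range(length):
        counts = Counter(s[col] for s in aligned_seqs if s[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background[b])
            for b in ALPHABET
        })
    return pwm


def score_row(pwm, row):
    """Sum the PWM scores over the non-gap positions of one aligned row."""
    return sum(pwm[i][ch] for i, ch in enumerate(row) if ch in ALPHABET)


def best_placement(pwm, candidates):
    """Return the candidate gapped row with the highest PWM score."""
    return max(candidates, key=lambda row: score_row(pwm, row))


# Toy data: three aligned sequences and one shorter sequence (ACGCGT)
# whose two-site gap could be placed in several ways.
others = ["ACGTACGT", "ACGTACGT", "ACGAACGT"]
candidates = ["AC--GCGT", "ACG--CGT", "ACGC--GT"]

pwm = build_pwm(others)
best = best_placement(pwm, candidates)
print(best)  # ACG--CGT
```

Because the PWM is built from every row of the alignment, each candidate placement is judged against all sequences at once rather than against a single pairwise partner, and scoring a row is linear in the alignment length.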


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/79f9/8623120/ebfc95c58e18/genes-12-01809-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/79f9/8623120/ce098f1207bf/genes-12-01809-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/79f9/8623120/1c43d634bc8c/genes-12-01809-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/79f9/8623120/451b0f42a37e/genes-12-01809-g003.jpg
