Motif-Aware PRALINE: Improving the alignment of motif regions.

Affiliation

Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.

Publication information

PLoS Comput Biol. 2018 Nov 1;14(11):e1006547. doi: 10.1371/journal.pcbi.1006547. eCollection 2018 Nov.

DOI: 10.1371/journal.pcbi.1006547

PMID: 30383764

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6233922/

Abstract

Protein or DNA motifs are sequence regions which possess biological importance. These regions are often highly conserved among homologous sequences. The generation of multiple sequence alignments (MSAs) with a correct alignment of the conserved sequence motifs is still difficult to achieve, due to the fact that the contribution of these typically short fragments is overshadowed by the rest of the sequence. Here we extended the PRALINE multiple sequence alignment program with a novel motif-aware MSA algorithm in order to address this shortcoming. This method can incorporate explicit information about the presence of externally provided sequence motifs, which is then used in the dynamic programming step by boosting the amino acid substitution matrix towards the motif. The strength of the boost is controlled by a parameter, α. Using a benchmark set of alignments we confirm that a good compromise can be found that improves the matching of motif regions while not significantly reducing the overall alignment quality. By estimating α on an unrelated set of reference alignments we find there is indeed a strong conservation signal for motifs. A number of typical but difficult MSA use cases are explored to exemplify the problems in correctly aligning functional sequence motifs and how the motif-aware alignment method can be employed to alleviate these problems.
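The idea of boosting motif positions during dynamic programming can be illustrated with a small pairwise example. The sketch below is not the PRALINE implementation: it assumes a toy match/mismatch score in place of a real amino acid substitution matrix, a linear gap penalty, and an additive bonus of `alpha` whenever both aligned residues lie inside an annotated motif. The names `pair_score` and `motif_aware_align` and the boolean motif masks are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def pair_score(a: str, b: str, both_in_motif: bool, alpha: float) -> float:
    """Toy substitution score, boosted by alpha when both residues are motif positions."""
    base = 1.0 if a == b else -1.0  # assumption: stand-in for a real substitution matrix
    return base + (alpha if both_in_motif else 0.0)


def motif_aware_align(
    seq1: str, mask1: List[bool],
    seq2: str, mask2: List[bool],
    alpha: float = 0.0, gap: float = -2.0,
) -> Tuple[str, str]:
    """Global (Needleman-Wunsch style) alignment with a motif bonus; linear gap penalty."""
    n, m = len(seq1), len(seq2)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]  # "D" diagonal, "U" up, "L" left

    # First column and row: only gaps are possible.
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "L"

    # Fill the matrix, remembering which move produced each cell's best score.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = pair_score(seq1[i - 1], seq2[j - 1], mask1[i - 1] and mask2[j - 1], alpha)
            candidates = [
                (score[i - 1][j - 1] + s, "D"),
                (score[i - 1][j] + gap, "U"),
                (score[i][j - 1] + gap, "L"),
            ]
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner to recover the alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "D":
            out1.append(seq1[i - 1]); out2.append(seq2[j - 1]); i -= 1; j -= 1
        elif step == "U":
            out1.append(seq1[i - 1]); out2.append("-"); i -= 1
        else:
            out1.append("-"); out2.append(seq2[j - 1]); j -= 1
    return "".join(reversed(out1)), "".join(reversed(out2))


# Two toy sequences sharing the motif "KT" at opposite ends.
seq_a, mask_a = "GGGGKT", [False] * 4 + [True] * 2
seq_b, mask_b = "KTGGGG", [True] * 2 + [False] * 4

plain = motif_aware_align(seq_a, mask_a, seq_b, mask_b, alpha=0.0)
boosted = motif_aware_align(seq_a, mask_a, seq_b, mask_b, alpha=10.0)
print(plain)
print(boosted)
```

In this toy case the unboosted alignment is ungapped and leaves the two `KT` motifs unaligned, because the longer run of `G` residues dominates the score. With a sufficiently large `alpha` the motif columns are matched at the cost of extra gaps elsewhere, which mirrors the trade-off between motif matching and overall alignment quality described in the abstract.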

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6601/6233922/2e03dd2ce2ba/pcbi.1006547.g001.jpg
