
Accuracy of multiple sequence alignment methods in the reconstruction of transposable element families.

Authors

Hubley Robert, Wheeler Travis J, Smit Arian F A

Affiliations

Institute for Systems Biology, Seattle, WA 98109, USA.

Department of Computer Science, University of Montana, Missoula, MT 59801, USA.

Publication

NAR Genom Bioinform. 2022 May 17;4(2):lqac040. doi: 10.1093/nargab/lqac040. eCollection 2022 Jun.

DOI:10.1093/nargab/lqac040
PMID:35591887
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9112768/
Abstract

The construction of a high-quality multiple sequence alignment (MSA) from copies of a transposable element (TE) is a critical step in the characterization of a new TE family. Most studies of MSA accuracy have been conducted on protein or RNA sequence families, where structural features and strong signals of selection may assist with alignment. Less attention has been given to the quality of sequence alignments involving neutrally evolving DNA sequences such as those resulting from TE replication. Transposable element sequences are challenging to align due to their wide divergence ranges, fragmentation, and predominantly-neutral mutation patterns. To gain insight into the effects of these properties on MSA accuracy, we developed a simulator of TE sequence evolution, and used it to generate a benchmark with which we evaluated the MSA predictions produced by several popular aligners, along with Refiner, a method we developed in the context of our RepeatModeler software. We find that MAFFT and Refiner generally outperform other aligners for low to medium divergence simulated sequences, while Refiner is uniquely effective when tasked with aligning high-divergent and fragmented instances of a family.
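
The benchmarking idea in the abstract (simulate TE copies whose true alignment is known, then score an aligner's output against that truth) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the simulator models only substitutions and fragmentation (no indels), and the sum-of-pairs score is one commonly used MSA accuracy measure, not necessarily the metric used in the paper. It is not the authors' simulator and does not reproduce Refiner or MAFFT.

```python
import random
from itertools import combinations


def simulate_copies(consensus, n_copies, divergence, min_frag, rng):
    """Toy simulator (hypothetical): mutate each base independently with
    probability `divergence`, then keep a random fragment of the copy.
    Returns (start_offset, fragment) tuples so the true alignment is known."""
    copies = []
    for _ in range(n_copies):
        mutated = [
            rng.choice([b for b in "ACGT" if b != base])
            if rng.random() < divergence
            else base
            for base in consensus
        ]
        frag_len = rng.randint(min_frag, len(mutated))
        start = rng.randint(0, len(mutated) - frag_len)
        copies.append((start, "".join(mutated[start:start + frag_len])))
    return copies


def reference_msa(copies, consensus_len):
    """With no indels, the true MSA just pads each fragment with gaps
    at its known offset along the consensus."""
    return [
        "-" * start + seq + "-" * (consensus_len - start - len(seq))
        for start, seq in copies
    ]


def aligned_pairs(msa):
    """Set of residue pairs ((row_i, pos_i), (row_j, pos_j)) that the MSA
    places in the same column; positions index the ungapped sequences."""
    pairs = set()
    counters = [0] * len(msa)
    for col in range(len(msa[0])):
        present = []
        for row, seq in enumerate(msa):
            if seq[col] != "-":
                present.append((row, counters[row]))
                counters[row] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(predicted, reference):
    """Fraction of reference residue pairs recovered by the predicted MSA."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(aligned_pairs(predicted) & ref) / len(ref)


if __name__ == "__main__":
    rng = random.Random(0)
    consensus = "".join(rng.choice("ACGT") for _ in range(40))
    copies = simulate_copies(consensus, n_copies=5, divergence=0.1,
                             min_frag=30, rng=rng)
    truth = reference_msa(copies, len(consensus))
    # In a real benchmark, `predicted` would come from running an aligner
    # on the unaligned fragments; here the truth is scored against itself.
    print(sum_of_pairs_score(truth, truth))
```

In an actual benchmark the unaligned fragments would be passed to each aligner under test, and the divergence and fragmentation settings would be varied to see where accuracy degrades.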


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/db6d58aa1fc1/lqac040fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/978227377593/lqac040fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/501d95ac411d/lqac040fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/106c8e3e1b33/lqac040fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/8c398e5b06cb/lqac040fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/4e50d6490f41/lqac040fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a674/9112768/5a8f9a001eae/lqac040fig7.jpg
