Improving prokaryotic transposable elements identification using a combination of de novo and profile HMM methods.

Affiliation

Laboratoire Evolution, Génomes, Spéciation, CNRS UPR9034/Université Paris-Sud, Gif-sur-Yvette, France.

Publication information

BMC Genomics. 2013 Oct 11;14:700. doi: 10.1186/1471-2164-14-700.

DOI:10.1186/1471-2164-14-700
PMID:24118975
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3852290/
Abstract

BACKGROUND

Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes, inducing duplications, deletions, rearrangements, and lateral gene transfers. Although ISs and MITEs are relatively simple genetic elements, their detection remains difficult because of their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, developing fast, reliable, and sensitive methods for detecting ISs and MITEs has become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection against reference libraries. Here we introduce alternative detection methods that either take advantage of the structural properties of the elements (de novo methods) or add a further library-based approach using profile HMM searches.

RESULTS

In this study, we developed three workflows dedicated to IS and MITE detection: the first two use de novo methods that detect either repeated sequences or the presence of Inverted Repeats; the third uses 28 in-house transposase alignment profiles with HMM search methods. We compared the performance of each method on a reference dataset of 30 archaeal and 30 bacterial genomes, as well as simulated and real metagenomes. Compared to a BLAST-based method using ISFinder as the library, de novo methods significantly improve IS and MITE detection. For example, in the 30 archaeal genomes, we discovered 30 new elements (+20%) in addition to the 141 multi-copy elements already detected by the BLAST approach. Many of the new elements correspond to ISs belonging to unknown or highly divergent families. The total number of MITEs even doubled with the discovery of elements displaying very limited sequence similarity to their respective autonomous partners (mainly in the Inverted Repeats of the elements). For metagenomes, with the exception of short-read data (<300 bp), for which both techniques seem equally limited, profile HMM searches considerably improve the detection of transposase-encoding genes (up to +50%) while generating a low level of false positives compared to BLAST-based methods.
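
One of the de novo workflows looks for the presence of Inverted Repeats. The abstract does not describe how this is implemented, so the Python sketch below is only an illustration of the general idea, not the authors' pipeline: it indexes every k-mer in a sequence and reports pairs where the reverse complement of a k-mer appears downstream. The function names and the parameters `k`, `min_gap`, and `max_span` are hypothetical, and their default values are arbitrary assumptions.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def find_inverted_repeats(seq, k=8, min_gap=20, max_span=3000):
    """Report exact inverted-repeat arm pairs of length k.

    Returns a list of (left_start, right_start, left_arm) tuples where the
    reverse complement of the left arm starts at right_start, the two arms
    are separated by at least min_gap bases, and the whole candidate
    element is no longer than max_span bases.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    seq = seq.upper()

    # Index the start positions of every k-mer in the sequence.
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)

    hits = []
    for i in range(len(seq) - k + 1):
        left_arm = seq[i:i + k]
        # Look up where the reverse complement of this k-mer occurs.
        for j in positions.get(reverse_complement(left_arm), []):
            gap = j - (i + k)       # bases between the two arms
            span = j + k - i        # total length of the candidate
            if gap >= min_gap and span <= max_span:
                hits.append((i, j, left_arm))
    return hits


if __name__ == "__main__":
    # Toy sequence: an 8-bp arm, a 30-bp spacer, and the arm's reverse complement.
    toy = "TTTT" + "GGCATCAG" + "A" * 30 + "CTGATGCC" + "TTTT"
    print(find_inverted_repeats(toy))  # prints [(4, 42, 'GGCATCAG')]
```

Exact k-mer matching is the simplest possible choice here. Real terminal Inverted Repeats are often imperfect, so a practical tool would need to tolerate mismatches and would likely combine this signal with others, such as copy number.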

CONCLUSION

Compared to classical BLAST-based methods, the de novo and profile HMM methods developed in this study are more sensitive and allow better, more reliable detection of transposons in prokaryotic genomes and metagenomes. We believe that future studies involving IS and MITE identification in genomic data should combine at least one de novo and one library-based method, with optimal results obtained by running both de novo methods in addition to a library-based search. For metagenomic data, profile HMM searches should be favored; a BLAST-based step is useful only for the final annotation into groups and families.
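
To make the profile HMM recommendation more concrete, the sketch below shows one way the hits of such a search could be post-processed. It assumes the search was run with HMMER 3 and written as a per-target table (`hmmsearch --tblout`), in which the first, third, fifth, and sixth whitespace-separated columns are the target name, the query profile name, the full-sequence E-value, and the full-sequence bit score. The abstract does not say which software or thresholds the authors used, so the `1e-5` cutoff, the function names, and the profile and read names in the example are all hypothetical.

```python
def parse_tblout(lines, max_evalue=1e-5):
    """Parse hmmsearch --tblout lines and keep hits at or below max_evalue.

    Returns a list of (target, profile, evalue, score) tuples.
    The E-value cutoff is an illustrative assumption, not a value
    taken from the paper.
    """
    hits = []
    for line in lines:
        # Skip blank lines and the '#' comment/header lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, profile = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append((target, profile, evalue, score))
    return hits


def best_profile_per_target(hits):
    """For each target, keep the profile with the highest bit score."""
    best = {}
    for target, profile, evalue, score in hits:
        if target not in best or score > best[target][2]:
            best[target] = (profile, evalue, score)
    return best


if __name__ == "__main__":
    # Hypothetical table content; names and numbers are made up.
    example = [
        "# target name  accession  query name  accession  E-value  score  bias ...",
        "read_001 - tnp_profile_A - 2.1e-30 105.2 0.1 2.5e-30 104.9 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein",
        "read_001 - tnp_profile_B - 3.0e-08 32.0 0.0 4.1e-08 31.6 0.0 1.1 1 0 0 1 1 1 1 hypothetical protein",
        "read_002 - tnp_profile_A - 0.5 8.1 0.0 0.7 7.6 0.0 1.2 1 0 0 1 1 0 0 hypothetical protein",
    ]
    for target, (profile, evalue, score) in best_profile_per_target(parse_tblout(example)).items():
        print(target, profile, evalue, score)
```

Keeping only the best-scoring profile per target is one simple way to assign each candidate transposase to a single profile; the assignment of elements to groups and families would still rely on a separate annotation step, as the conclusion notes.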

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a689/3852290/b0614dec5549/1471-2164-14-700-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a689/3852290/6a4fcbe8c9ed/1471-2164-14-700-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a689/3852290/61b1532c62ed/1471-2164-14-700-2.jpg
