Reproducible evaluation of transposable element detectors with McClintock 2 guides accurate inference of Ty insertion patterns in yeast.

Author information

Chen Jingxuan, Basting Preston J, Han Shunhua, Garfinkel David J, Bergman Casey M

Affiliations

Institute of Bioinformatics, University of Georgia, Athens, GA, USA.

Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, USA.

Publication information

Mob DNA. 2023 Jul 14;14(1):8. doi: 10.1186/s13100-023-00296-4.

DOI:10.1186/s13100-023-00296-4

PMID:37452430

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10347736/

Abstract

BACKGROUND

Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute, or evaluate multiple TE insertion detectors.

RESULTS

We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide consistent estimates of ~50 non-reference TE insertions per strain and that Ty2 has the highest number of non-reference TE insertions in a species-wide panel of ~1000 yeast genomes. Finally, we show that best-in-class predictors for yeast applied to resequencing data have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences revealed previously for experimentally-induced Ty1 insertions to spontaneous insertions for other copia-superfamily retrotransposons in yeast.

CONCLUSION

McClintock (https://github.com/bergmanlab/mcclintock/) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors in other species.
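The abstract compares detectors by recall and precision on simulated data. As a rough illustration of what those two measures mean for insertion-site predictions, the Python sketch below scores a list of predicted insertions against a list of simulated ones. It is not McClintock's own evaluation code: the `Insertion` record, the `evaluate` function, and the matching rule (same chromosome, same TE family, position within a `window` of base pairs) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insertion:
    """Hypothetical record for one non-reference TE insertion."""
    chrom: str
    pos: int
    family: str


def matches(pred: Insertion, truth: Insertion, window: int) -> bool:
    # Assumed matching rule: same chromosome and family, position within `window` bp.
    return (
        pred.chrom == truth.chrom
        and pred.family == truth.family
        and abs(pred.pos - truth.pos) <= window
    )


def evaluate(predicted, simulated, window=0):
    """Return (precision, recall); each simulated site can be matched at most once."""
    unmatched = list(simulated)
    true_pos = 0
    for pred in predicted:
        hit = next((t for t in unmatched if matches(pred, t, window)), None)
        if hit is not None:
            unmatched.remove(hit)
            true_pos += 1
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(simulated) if simulated else 0.0
    return precision, recall


# Toy data, invented for illustration only.
simulated = [Insertion("chrI", 1000, "Ty1"), Insertion("chrII", 5000, "Ty2")]
predicted = [
    Insertion("chrI", 1002, "Ty1"),   # 2 bp away from a simulated site
    Insertion("chrII", 9000, "Ty2"),  # right family, wrong location
    Insertion("chrIV", 300, "Ty4"),   # no simulated counterpart
]

precision, recall = evaluate(predicted, simulated, window=5)
exact_precision, exact_recall = evaluate(predicted, simulated, window=0)
```

With a 5 bp window, one of the three toy predictions matches one of the two simulated sites, so precision is 1/3 and recall is 1/2. With `window=0` the same prediction no longer counts, which is why the positional tolerance matters when detectors are compared on how precisely they locate insertions.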
