
A Pipeline NanoTRF as a New Tool for Satellite DNA Identification in the Raw Nanopore Sequencing Reads of Plant Genomes.

Authors

Kirov Ilya, Kolganova Elizaveta, Dudnikov Maxim, Yurkevich Olga Yu, Amosova Alexandra V, Muravenko Olga V

Affiliations

All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia.

Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia.

Publication information

Plants (Basel). 2022 Aug 12;11(16):2103. doi: 10.3390/plants11162103.

DOI: 10.3390/plants11162103
PMID: 36015406
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9413040/
Abstract

High-copy tandemly organized repeats (TRs), or satellite DNA, are an important but still enigmatic component of eukaryotic genomes. TRs comprise arrays of multi-copy, highly similar tandem repeats, which makes their elucidation very challenging. Oxford Nanopore sequencing data provide a valuable source of information on TR organization at the single-molecule level. However, bioinformatics tools for de novo identification of TRs in raw Nanopore data have not been reported so far. We developed NanoTRF, a new Python pipeline for TR identification, characterization and consensus monomer sequence assembly. The pipeline requires only a raw Nanopore read file from low-depth (<1×) genome sequencing. It generates an informative HTML report and figures on TR genome abundance, monomer sequence and monomer length. In addition, NanoTRF annotates transposable element (TE) sequences within or near satDNA arrays, and this information can be used to elucidate how TRs and TEs co-evolve in the genome. Moreover, we validated by FISH that the NanoTRF report is useful for evaluating whether TR chromosome organization is clustered or dispersed. Our findings show that NanoTRF is a robust method for the de novo identification of satellite repeats in raw Nanopore data without prior read assembly. The obtained sequences can be used in many downstream analyses, including genome assembly assistance and gap estimation, chromosome mapping and cytogenetic marker development.
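
The abstract names three outputs (monomer length, consensus monomer sequence and TR abundance) but does not describe how NanoTRF computes them, so the following is not the authors' method. It is a minimal Python sketch of one simple way such quantities could be estimated from single reads, assuming substitution-only errors (real Nanopore reads also contain insertions and deletions, which this toy approach does not handle). The function names `best_period`, `consensus_monomer` and `tr_read_fraction`, and the thresholds used, are hypothetical.

```python
from collections import Counter


def best_period(seq, min_period=2, max_period=500, tolerance=0.05):
    """Estimate a tandem-repeat monomer length by self-comparison.

    For each lag p, compute the fraction of positions where
    seq[i] == seq[i + p]. Returns (period, match_fraction),
    or (0, 0.0) if the read is too short to test any lag.
    """
    upper = min(max_period, len(seq) // 2)
    scores = {}
    for p in range(min_period, upper + 1):
        pairs = len(seq) - p
        matches = sum(seq[i] == seq[i + p] for i in range(pairs))
        scores[p] = matches / pairs
    if not scores:
        return 0, 0.0
    top = max(scores.values())
    # Multiples of the true period score about as well as the period
    # itself, so take the smallest lag within `tolerance` of the best.
    period = min(p for p, s in scores.items() if s >= top - tolerance)
    return period, scores[period]


def consensus_monomer(seq, period):
    """Majority-vote consensus over consecutive, non-overlapping copies.

    Assumption: copies stay in register (no indels), which real
    Nanopore reads do not guarantee.
    """
    copies = [seq[i:i + period]
              for i in range(0, len(seq) - period + 1, period)]
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


def tr_read_fraction(reads, threshold=0.8):
    """Fraction of sequenced bases in reads that look tandemly repeated.

    A crude stand-in for genome abundance: a read counts as TR-like
    if its best self-match fraction reaches `threshold` (arbitrary).
    """
    total = sum(len(r) for r in reads)
    if total == 0:
        return 0.0
    tr_bases = sum(len(r) for r in reads if best_period(r)[1] >= threshold)
    return tr_bases / total


if __name__ == "__main__":
    read = "ACGTTGCA" * 20          # toy read: 20 copies of an 8 bp monomer
    period, score = best_period(read)
    print(period, score, consensus_monomer(read, period))
```

Counting whole reads rather than array segments keeps the sketch short. A realistic tool would locate array boundaries within each read and align monomers before building a consensus.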

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0a/9413040/ea7a17aec9a6/plants-11-02103-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0a/9413040/01efd0f966a7/plants-11-02103-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0a/9413040/46f8fdd1821d/plants-11-02103-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab0a/9413040/ed7a166113c3/plants-11-02103-g004.jpg

Similar articles

1. A Pipeline NanoTRF as a New Tool for Satellite DNA Identification in the Raw Nanopore Sequencing Reads of Plant Genomes.
   Plants (Basel). 2022 Aug 12;11(16):2103. doi: 10.3390/plants11162103.
2. Pipeline for the Rapid Development of Cytogenetic Markers Using Genomic Data of Related Species.
   Genes (Basel). 2019 Feb 1;10(2):113. doi: 10.3390/genes10020113.
3. De novo identification of satellite DNAs in the sequenced genomes of Drosophila virilis and D. americana using the RepeatExplorer and TAREAN pipelines.
   PLoS One. 2019 Dec 19;14(12):e0223466. doi: 10.1371/journal.pone.0223466. eCollection 2019.
4. Characterization of repeat arrays in ultra-long nanopore reads reveals frequent origin of satellite DNA from retrotransposon-derived tandem repeats.
   Plant J. 2020 Jan;101(2):484-500. doi: 10.1111/tpj.14546. Epub 2019 Nov 3.
5. De novo Nanopore read quality improvement using deep learning.
   BMC Bioinformatics. 2019 Nov 6;20(1):552. doi: 10.1186/s12859-019-3103-z.
6. Tandemly repeated DNA families in the mouse genome.
   BMC Genomics. 2011 Oct 28;12:531. doi: 10.1186/1471-2164-12-531.
7. Transposons and satellite DNA: on the origin of the major satellite DNA family in the genome.
   Mob DNA. 2020 Jun 26;11:20. doi: 10.1186/s13100-020-00219-7. eCollection 2020.
8. Evaluation of strategies for the assembly of diverse bacterial genomes using MinION long-read sequencing.
   BMC Genomics. 2019 Jan 9;20(1):23. doi: 10.1186/s12864-018-5381-7.
9. Complex sequence organization of heterochromatin in the holocentric plant elucidated by the computational analysis of nanopore reads.
   Comput Struct Biotechnol J. 2021 Apr 22;19:2179-2189. doi: 10.1016/j.csbj.2021.04.011. eCollection 2021.
10. Nanopore sequencing and full genome de novo assembly of human cytomegalovirus TB40/E reveals clonal diversity and structural variations.
    BMC Genomics. 2018 Aug 2;19(1):577. doi: 10.1186/s12864-018-4949-6.

Cited by

1. SatXplor-a comprehensive pipeline for satellite DNA analyses in complex genome assemblies.
   Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae660.
2. Bioinformatics in Russia: history and present-day landscape.
   Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae513.
3. Genome Studies in Four Species of L. (Asteraceae) Using Satellite DNAs as Chromosome Markers.
