tidk: a toolkit to rapidly identify telomeric repeats from genomic datasets.

Authors

Brown Max R, Manuel Gonzalez de La Rosa Pablo, Blaxter Mark

Affiliations

School of Life Sciences, Anglia Ruskin University, Cambridge, CB1 1PT, United Kingdom.

Tree of Life, Wellcome Sanger Institute, Hinxton, CB10 1RQ, United Kingdom.

Publication information

Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf049.

DOI: 10.1093/bioinformatics/btaf049
PMID: 39891350
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11814493/

Abstract

SUMMARY

"tidk" (short for telomere identification toolkit) uses a simple, fast algorithm to scan long DNA reads for the presence of short tandemly repeated DNA in runs, and to aggregate them based on canonical DNA string representation. These are telomeric repeat candidates. Our algorithm is shown to be accurate in genomes for which the telomeric repeat unit is known and is tested across a wide variety of newly assembled genomes to uncover new telomeric repeat units. Tools are provided to identify telomeric repeats de novo, scan genomes for known telomeric repeats, and to visualize telomeric repeats on the assembly. "tidk" is implemented in Rust and is available as a command line tool which can be compiled using the Rust toolchain or downloaded as a binary from bioconda.
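
The abstract names two ideas: finding runs of a short unit repeated in tandem, and aggregating those runs under a canonical string representation. The Python sketch below illustrates both under stated assumptions and is not tidk's actual Rust implementation. It assumes "canonical" means the lexicographically smallest string among all rotations of a unit and of its reverse complement, and it uses a fixed unit length `k` and a minimum copy count `min_copies`. All function names and parameters here are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(unit: str) -> str:
    """Smallest string among all rotations of `unit` and of its reverse complement.

    Assumption: this is one reasonable reading of "canonical DNA string
    representation"; the paper's exact definition may differ.
    """
    candidates = []
    for s in (unit, reverse_complement(unit)):
        candidates.extend(s[i:] + s[:i] for i in range(len(s)))
    return min(candidates)


def tandem_runs(seq: str, k: int, min_copies: int):
    """Yield (start, unit, copies) for each k-mer repeated back to back."""
    i = 0
    while i + k <= len(seq):
        unit = seq[i:i + k]
        copies = 1
        # Extend the run while the next k bases equal the current unit.
        while seq[i + copies * k:i + (copies + 1) * k] == unit:
            copies += 1
        if copies >= min_copies:
            yield i, unit, copies
            i += copies * k  # jump past the whole run
        else:
            i += 1


def candidate_repeats(reads, k=6, min_copies=3):
    """Sum repeat copies per canonical unit across all reads."""
    totals = Counter()
    for read in reads:
        for _, unit, copies in tandem_runs(read.upper(), k, min_copies):
            totals[canonical(unit)] += copies
    return totals


# Two toy reads carrying the same repeat on opposite strands.
reads = ["ACGT" + "TTAGGG" * 5 + "ACGTAC", "GG" + "CCCTAA" * 4 + "T"]
print(candidate_repeats(reads))  # Counter({'AACCCT': 9})
```

Because both strands and every rotation collapse to one key, the runs of TTAGGG and CCCTAA are counted together. A real tool would also need to handle several unit lengths, low-complexity units such as homopolymers, and sequencing errors, none of which this sketch attempts.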

AVAILABILITY AND IMPLEMENTATION

The "tidk" Rust crate is freely available under the MIT license (https://crates.io/crates/tidk), and the source code is available at https://github.com/tolkit/telomeric-identifier.
