GCI: a continuity inspector for complete genome assembly.

Affiliations

International Institutes of Medicine, The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 322000, China.

Center for Evolutionary & Organismal Biology, Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China.

Publication information

Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae633.

DOI:10.1093/bioinformatics/btae633
PMID:39432569
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11550331/
Abstract

MOTIVATION

Recent advances in long-read sequencing technologies have significantly facilitated the production of high-quality genome assemblies. The telomere-to-telomere (T2T) gapless assembly has become the new gold standard of genome assembly efforts. Several recent efforts have claimed to produce T2T-level reference genomes. However, a universal standard for qualifying a genome assembly as T2T is still missing. Traditional genome assembly assessment metrics (N50 and its derivatives) cannot differentiate a nearly T2T assembly from a truly T2T assembly in continuity, either globally or locally. Additionally, these metrics are independent of raw reads, making them easy to inflate through artificial operations. Therefore, a gaplessness evaluation tool at single-nucleotide resolution that reflects true completeness is urgently needed in the era of complete genomes.
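The limitation of N50 described above can be illustrated with a small sketch. The function below uses the standard definition of N50; the contig lengths are hypothetical values chosen for illustration and are not data from the paper.

```python
def n50(lengths):
    """Return N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


# Hypothetical three-chromosome genome, lengths in Mb (illustrative only).
complete = [100, 80, 60]            # every chromosome is one gapless contig
nearly_complete = [100, 80, 59, 1]  # the smallest chromosome is broken once

print(n50(complete), n50(nearly_complete))
```

Both assemblies report the same N50 even though only the first is gapless, which is the kind of blind spot a read-supported, per-base check is meant to address.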

RESULTS

Here, we present a tool called Genome Continuity Inspector (GCI), designed to assess genome assembly continuity at single-base resolution, and evaluate how close an assembly is to the T2T level. GCI utilizes multiple aligners to map long reads from various sequencing platforms back to the assembly. By incorporating curated mapping coverage of high-confidence read alignments, GCI identifies potential assembly issues. Meanwhile, it provides GCI scores that quantify overall assembly continuity on the whole genome or chromosome scales.
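As a rough illustration of coverage-based checking at single-base resolution, the sketch below computes per-base depth from a list of alignment intervals and reports regions whose depth falls below a threshold. It is not GCI's implementation: the function name `low_coverage_regions`, the `min_depth` parameter, and the toy intervals are assumptions made for this example, and GCI's alignment filtering and scoring are not reproduced here.

```python
def low_coverage_regions(contig_length, alignments, min_depth=1):
    """Return half-open [start, end) intervals whose read depth is
    below min_depth. Alignments are 0-based, half-open (start, end)."""
    # Difference array: +1 where an alignment starts, -1 where it ends.
    diff = [0] * (contig_length + 1)
    for start, end in alignments:
        diff[start] += 1
        diff[end] -= 1

    regions = []
    depth = 0
    region_start = None
    for pos in range(contig_length):
        depth += diff[pos]  # running sum gives the depth at this base
        if depth < min_depth:
            if region_start is None:
                region_start = pos
        elif region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    if region_start is not None:
        regions.append((region_start, contig_length))
    return regions


# Hypothetical 20-base contig with three read alignments (illustrative only).
toy_alignments = [(0, 8), (5, 12), (15, 20)]
print(low_coverage_regions(20, toy_alignments))
```

In this toy input, bases 12 to 14 are covered by no alignment, so that stretch is the one reported as a potential issue. A real workflow would take the intervals from filtered aligner output rather than a hand-written list.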

AVAILABILITY AND IMPLEMENTATION

The open-source GCI code is freely available on GitHub (https://github.com/yeeus/GCI) under the MIT license.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1df4/11550331/b4144ac8d9bd/btae633f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1df4/11550331/d1ca9ec445bb/btae633f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1df4/11550331/394fe7b10f2f/btae633f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1df4/11550331/87577d7291b7/btae633f4.jpg

Similar articles

1
GCI: a continuity inspector for complete genome assembly.
Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae633.
2
NextPolish2: A Repeat-aware Polishing Tool for Genomes Assembled Using HiFi Long Reads.
Genomics Proteomics Bioinformatics. 2024 May 9;22(1). doi: 10.1093/gpbjnl/qzad009.
3
scanPAV: a pipeline for extracting presence-absence variations in genome pairs.
Bioinformatics. 2018 Sep 1;34(17):3022-3024. doi: 10.1093/bioinformatics/bty189.
4
ntLink: A Toolkit for De Novo Genome Assembly Scaffolding and Mapping Using Long Reads.
Curr Protoc. 2023 Apr;3(4):e733. doi: 10.1002/cpz1.733.
5
QuorUM: An Error Corrector for Illumina Reads.
PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015.
6
Comparison of Long-Read Methods for Sequencing and Assembly of Lepidopteran Pest Genomes.
Int J Mol Sci. 2022 Dec 30;24(1):649. doi: 10.3390/ijms24010649.
7
GAEP: a comprehensive genome assembly evaluating pipeline.
J Genet Genomics. 2023 Oct;50(10):747-754. doi: 10.1016/j.jgg.2023.05.009. Epub 2023 May 26.
8
Era of gapless plant genomes: innovations in sequencing and mapping technologies revolutionize genomics and breeding.
Curr Opin Biotechnol. 2023 Feb;79:102886. doi: 10.1016/j.copbio.2022.102886. Epub 2023 Jan 12.
9
Nanopore ultra-long sequencing and adaptive sampling spur plant complete telomere-to-telomere genome assembly.
Mol Plant. 2024 Nov 4;17(11):1773-1786. doi: 10.1016/j.molp.2024.10.008. Epub 2024 Oct 16.
10
NucBreak: location of structural errors in a genome assembly by using paired-end Illumina reads.
BMC Bioinformatics. 2020 Feb 21;21(1):66. doi: 10.1186/s12859-020-3414-0.

Cited by

1
Whole-Genome Sequencing and Biosynthetic Gene Cluster Analysis of Novel Entomopathogenic Bacteria ALN 7.1 and ALN 11.5.
Biology (Basel). 2025 Jul 22;14(8):905. doi: 10.3390/biology14080905.
2
A telomere-to-telomere genome assembly of koi carp (Cyprinus carpio) using long reads and Hi-C technology.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf087.
3
The telomere-to-telomere gapless genome of grass carp provides insights for genetic improvement.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf059.
4
Near telomere-to-telomere genome assembly of the blackspot tuskfish (Choerodon schoenleinii).
Sci Data. 2025 Mar 31;12(1):537. doi: 10.1038/s41597-025-04893-1.
5
Chromosome-scale and haplotype-resolved genome assembly of .
Hortic Res. 2025 Jan 15;12(4):uhaf012. doi: 10.1093/hr/uhaf012. eCollection 2025 Apr.

References

1
Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement.
Nat Commun. 2023 Oct 17;14(1):6556. doi: 10.1038/s41467-023-42336-w.
2
The complete sequence of a human Y chromosome.
Nature. 2023 Sep;621(7978):344-354. doi: 10.1038/s41586-023-06457-y. Epub 2023 Aug 23.
3
The complete and fully-phased diploid genome of a male Han Chinese.
Cell Res. 2023 Oct;33(10):745-761. doi: 10.1038/s41422-023-00849-5. Epub 2023 Jul 14.
4
A draft human pangenome reference.
Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. Epub 2023 May 10.
5
Evolutionary analysis of a complete chicken genome.
Proc Natl Acad Sci U S A. 2023 Feb 21;120(8):e2216641120. doi: 10.1073/pnas.2216641120. Epub 2023 Feb 13.
6
A proposed metric set for evaluation of genome assembly quality.
Trends Genet. 2023 Mar;39(3):175-186. doi: 10.1016/j.tig.2022.10.005. Epub 2022 Nov 17.
7
Fast and accurate mapping of long reads to complete genome assemblies with VerityMap.
Genome Res. 2022 Nov-Dec;32(11-12):2107-2118. doi: 10.1101/gr.276871.122. Epub 2022 Nov 15.
8
Semi-automated assembly of high-quality diploid human reference genomes.
Nature. 2022 Nov;611(7936):519-531. doi: 10.1038/s41586-022-05325-5. Epub 2022 Oct 19.
9
Long-read mapping to repetitive reference sequences using Winnowmap2.
Nat Methods. 2022 Jun;19(6):705-710. doi: 10.1038/s41592-022-01457-8. Epub 2022 Apr 1.
10
Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies.
Nat Methods. 2022 Jun;19(6):687-695. doi: 10.1038/s41592-022-01440-3. Epub 2022 Mar 31.