Phased Genome Assemblies.

Affiliation

Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia.

Publication information

Methods Mol Biol. 2023;2590:273-286. doi: 10.1007/978-1-0716-2819-5_16.

DOI:10.1007/978-1-0716-2819-5_16
PMID:36335504
Abstract

The ultimate goal of de novo assembly of reads sequenced from a diploid individual is the separate reconstruction of the sequences corresponding to the two copies of each chromosome. Unfortunately, the allele linkage information needed to perform phased genome assemblies has been difficult to generate. Hence, most current genome assemblies are a haploid mixture of the two underlying chromosome copies present in the sequenced individual. Sequencing technologies that provide long (20 kb) and accurate reads are the basis for generating phased genome assemblies. This chapter provides a brief overview of the main milestones in traditional genome assembly, focusing on the bioinformatic techniques developed to generate haplotype information from different specialized protocols. With these techniques as background, the chapter reviews the current algorithms for generating phased assemblies from long reads with low error rates. Current techniques perform haplotype-aware error correction steps to increase the quality of the raw reads. In addition, variations on the traditional overlap-layout-consensus (OLC) graph have been developed to eliminate edges between reads sequenced from different chromosome copies. This allows large presence-absence variants between the chromosome copies to be taken into account. The development of these algorithms, along with improved sequencing technologies, has been crucial to finishing chromosome-level assemblies of complex genomes.
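
The idea of eliminating overlap-graph edges between reads from different chromosome copies can be illustrated with a toy example. The sketch below is not the method of any assembler reviewed in the chapter; it is a minimal illustration that assumes error-free reads and a precomputed table of the alleles each read carries at heterozygous sites. The function names (`reads_conflict`, `filter_overlap_edges`, `connected_components`), the read identifiers, and the site positions are all hypothetical.

```python
def reads_conflict(alleles_a, alleles_b):
    """Return True if two reads disagree at any heterozygous site both cover."""
    shared_sites = alleles_a.keys() & alleles_b.keys()
    return any(alleles_a[site] != alleles_b[site] for site in shared_sites)


def filter_overlap_edges(overlaps, read_alleles):
    """Keep only overlap edges whose two reads agree at every shared site."""
    return [
        (a, b)
        for a, b in overlaps
        if not reads_conflict(read_alleles.get(a, {}), read_alleles.get(b, {}))
    ]


def connected_components(reads, edges):
    """Group reads into connected components of the filtered overlap graph."""
    adjacency = {read: set() for read in reads}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in reads:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical input: alleles observed by each read at heterozygous positions.
read_alleles = {
    "r1": {100: "A", 200: "C"},
    "r2": {200: "C", 300: "G"},
    "r3": {100: "G", 200: "T"},
    "r4": {200: "T", 300: "A"},
}
# Hypothetical overlaps as an all-vs-all aligner might report them.
overlaps = [("r1", "r2"), ("r1", "r3"), ("r2", "r4"), ("r3", "r4"), ("r2", "r3")]

phased_edges = filter_overlap_edges(overlaps, read_alleles)
haplotype_groups = connected_components(sorted(read_alleles), phased_edges)
print(phased_edges)
print(haplotype_groups)
```

In this toy input, only the edges joining reads with matching alleles survive, so the graph splits into one component per chromosome copy. Real reads contain sequencing errors, so a practical implementation would need to tolerate some mismatches instead of rejecting an edge on a single disagreement.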
