Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data.

Authors

Jiao Wen-Biao, Accinelli Gonzalo Garcia, Hartwig Benjamin, Kiefer Christiane, Baker David, Severing Edouard, Willing Eva-Maria, Piednoel Mathieu, Woetzel Stefan, Madrid-Herrero Eva, Huettel Bruno, Hümann Ulrike, Reinhard Richard, Koch Marcus A, Swan Daniel, Clavijo Bernardo, Coupland George, Schneeberger Korbinian

Affiliations

Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany.

Earlham Institute, Norwich Research Park, Norwich NR4 7UH, United Kingdom.

Publication information

Genome Res. 2017 May;27(5):778-786. doi: 10.1101/gr.213652.116. Epub 2017 Feb 3.

DOI:10.1101/gr.213652.116

PMID:28159771

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5411772/

Abstract

Long-read sequencing can overcome the weaknesses of short reads in the assembly of eukaryotic genomes; however, at present additional scaffolding is needed to achieve chromosome-level assemblies. We generated Pacific Biosciences (PacBio) long-read data of the genomes of three relatives of the model plant Arabidopsis thaliana and assembled all three genomes into only a few hundred contigs. To improve the contiguity of these assemblies, we generated BioNano Genomics optical mapping and Dovetail Genomics chromosome conformation capture data for genome scaffolding. Despite their technical differences, optical mapping and chromosome conformation capture performed similarly and doubled N50 values. After improving both integration methods, assembly contiguity reached the chromosome-arm level. We rigorously assessed the quality of contigs and scaffolds using Illumina mate-pair libraries and genetic map information. This showed that PacBio assemblies have high sequence accuracy but can contain several misassemblies, which join unlinked regions of the genome. Most, but not all, of these misjoints were removed during the integration of the optical mapping and chromosome conformation capture data. Even though none of the centromeres were fully assembled, the scaffolds revealed large parts of some centromeric regions, even including some of the heterochromatic regions, which are not present in gold standard reference sequences.
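The abstract reports contiguity as N50, the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below is only an illustration of that standard definition, not the authors' pipeline; the function name `n50` and the contig and scaffold lengths are hypothetical values chosen so that scaffolding doubles the N50.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical lengths in Mb (not data from the paper).
contigs = [40, 40, 30, 20, 20]
# The same sequence after joining 40 + 40 and 30 + 20 into scaffolds.
scaffolds = [80, 50, 20]

print("contig N50:", n50(contigs))
print("scaffold N50:", n50(scaffolds))
```

Because scaffolding joins existing contigs without adding sequence, the total length is unchanged in this toy case and only the distribution of lengths shifts, which is what moves the N50.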
