A critical comparison of technologies for a plant genome sequencing project.

Author information

Paajanen Pirita, Kettleborough George, López-Girona Elena, Giolai Michael, Heavens Darren, Baker David, Lister Ashleigh, Cugliandolo Fiorella, Wilde Gail, Hein Ingo, Macaulay Iain, Bryan Glenn J, Clark Matthew D

Affiliations

Technology Development, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK.

Department of Cell and Developmental Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK.

Publication information

Gigascience. 2019 Mar 1;8(3). doi: 10.1093/gigascience/giy163.

DOI:10.1093/gigascience/giy163

PMID:30624602

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6423373/

Abstract

BACKGROUND

A high-quality genome sequence of any model organism is an essential starting point for genetic and other studies. Older clone-based methods are slow and expensive, whereas faster, cheaper short-read-only assemblies can be incomplete and highly fragmented, which minimizes their usefulness. The last few years have seen the introduction of many new technologies for genome assembly. These new technologies and associated new algorithms are typically benchmarked on microbial genomes or, if they scale appropriately, on larger (e.g., human) genomes. However, plant genomes can be much more repetitive and larger than the human genome, and plant biochemistry often makes obtaining high-quality DNA that is free from contaminants difficult. Reflecting their challenging nature, we observe that plant genome assembly statistics are typically poorer than for vertebrates.

RESULTS

Here, we compare Illumina short reads, Pacific Biosciences long reads, 10x Genomics linked reads, Dovetail Hi-C, and BioNano Genomics optical maps, singly and combined, in producing high-quality long-range genome assemblies of the potato species Solanum verrucosum. We benchmark the assemblies for completeness and accuracy, as well as DNA, compute requirements, and sequencing costs.

CONCLUSIONS

The field of genome sequencing and assembly is reaching maturity, and the differences we observe between assemblies are surprisingly small. We expect that our results will be helpful to other genome projects, and that these datasets will be used in benchmarking by assembly algorithm developers.
