LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads.

Author information

Warren René L, Yang Chen, Vandervalk Benjamin P, Behsaz Bahar, Lagman Albert, Jones Steven J M, Birol Inanç

Affiliations

BC Cancer Agency, Michael Smith Genome Sciences Centre, Vancouver, British Columbia V5Z 4S6 Canada.

Publication information

Gigascience. 2015 Aug 4;4:35. doi: 10.1186/s13742-015-0076-3. eCollection 2015.

DOI:10.1186/s13742-015-0076-3

PMID:26244089

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4524009/

Abstract

BACKGROUND

Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the inability of short reads to capture sufficient genomic information to resolve those problematic regions. In this regard, established and emerging long read technologies show great promise, but their current associated higher error rates typically require computational base correction and/or additional bioinformatics pre-processing before they can be of value.

RESULTS

We present LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a method that makes use of the sequence properties of nanopore sequence data and other error-containing sequence data, to scaffold high-quality genome assemblies, without the need for read alignment or base correction. Here, we show how the contiguity of an ABySS Escherichia coli K-12 genome assembly can be increased greater than five-fold by the use of beta-released Oxford Nanopore Technologies Ltd. long reads and how LINKS leverages long-range information in Saccharomyces cerevisiae W303 nanopore reads to yield assemblies whose resulting contiguity and correctness are on par with or better than that of competing applications. We also present the re-scaffolding of the colossal white spruce (Picea glauca) draft assembly (PG29, 20 Gbp) and demonstrate how LINKS scales to larger genomes.
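The abstract does not spell out the mechanics, but the name (Long Interval Nucleotide K-mer) and the alignment-free claim suggest pairing k-mers taken a fixed interval apart in a long read and looking them up by exact match in the draft contigs. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the start-to-start definition of `distance`, the `step` parameter, and the rule of discarding k-mers that occur more than once are all assumptions made for illustration. Reverse complements, multiple distance ranges, and the final scaffold layout are left out.

```python
from collections import defaultdict


def index_contig_kmers(contigs, k):
    """Map each k-mer seen exactly once across all contigs to (contig_id, position).

    Assumption: k-mers occurring more than once are ambiguous and are dropped.
    """
    seen = {}
    repeated = set()
    for cid, seq in contigs.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in seen:
                repeated.add(kmer)
            else:
                seen[kmer] = (cid, i)
    return {km: loc for km, loc in seen.items() if km not in repeated}


def kmer_pairs(read, k, distance, step=1):
    """Yield (first, second) k-mers whose start positions are `distance` apart."""
    for i in range(0, len(read) - distance - k + 1, step):
        yield read[i:i + k], read[i + distance:i + distance + k]


def count_links(contigs, reads, k, distance, step=1):
    """Count k-mer pairs whose two halves hit two different contigs.

    No alignment is performed: a pair counts only if both k-mers match a
    contig k-mer exactly, so pairs touching a read error simply drop out.
    """
    index = index_contig_kmers(contigs, k)
    links = defaultdict(int)
    for read in reads:
        for first, second in kmer_pairs(read, k, distance, step):
            if first in index and second in index:
                contig_a, contig_b = index[first][0], index[second][0]
                if contig_a != contig_b:
                    links[(contig_a, contig_b)] += 1
    return dict(links)


if __name__ == "__main__":
    # Toy data: the read covers the end of c1, two unrelated bases, then the start of c2.
    contigs = {"c1": "AAACCCGGGT", "c2": "TTTGACTAGC"}
    reads = ["CCGGGTATTTTGAC"]
    print(count_links(contigs, reads, k=4, distance=8))
```

In this toy case three k-mer pairs support joining `c1` to `c2`. A real scaffolder would also need a minimum support threshold and orientation handling, which this sketch does not attempt.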

CONCLUSIONS

This study highlights the present utility of nanopore reads for genome scaffolding in spite of their current limitations, which are expected to diminish as the nanopore sequencing technology advances. We expect LINKS to have broad utility in harnessing the potential of long reads in connecting high-quality sequences of small and large genome assembly drafts.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/744a/4524009/3aef60e43c07/13742_2015_76_Fig1_HTML.jpg
