HyLight: Strain aware assembly of low coverage metagenomes.

Affiliations

College of Biology, Hunan University, Changsha, China.

Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany.

Publication information

Nat Commun. 2024 Oct 7;15(1):8665. doi: 10.1038/s41467-024-52907-0.

Abstract

Different strains of the same species can vary substantially in their biomedically relevant phenotypes. Reconstructing the genomes of microbial communities at the strain level poses significant challenges, because sequencing errors can obscure strain-specific variants. Next-generation sequencing (NGS) reads are too short to resolve complex genomic regions. Third-generation sequencing (TGS) reads, although longer, are either prone to higher error rates or substantially more expensive. Limiting TGS coverage to reduce costs compromises the accuracy of the assemblies. This explains why prior approaches all sacrifice strain awareness or accuracy, tend to incur excessive costs, or suffer from a combination of these drawbacks. We introduce HyLight, a metagenome assembly approach that addresses these challenges by combining the complementary strengths of TGS and NGS data. HyLight employs strain-resolved overlap graphs (OG) to accurately reconstruct individual strains within microbial communities. Our experiments demonstrate that HyLight produces strain-aware and contiguous assemblies with minimal error content, while significantly reducing costs because it uses only low-coverage TGS data. HyLight achieves an average improvement of 19.05% in preserving strain identity and demonstrates near-complete strain awareness across diverse datasets. In summary, HyLight offers considerable advances in metagenome assembly: it delivers significantly enhanced strain awareness, contiguity, and accuracy without the typical compromises observed in existing approaches.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/92e7/11458758/8fd85ad198b5/41467_2024_52907_Fig1_HTML.jpg
