Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data.

Author information

Research School of Biology, Australian National University, Canberra, ACT, Australia.

Molecular Mycology Research Laboratory, Centre for Infectious Diseases and Microbiology, Faculty of Medicine and Health, Sydney Medical School, Westmead Clinical School, The University of Sydney, Sydney, NSW, Australia.

Publication information

mBio. 2022 Apr 26;13(2):e0244421. doi: 10.1128/mbio.02444-21. Epub 2022 Apr 11.

Abstract

The kingdom Fungi is highly diverse in morphology and ecosystem function. Yet fungi are challenging to characterize as they can be difficult to culture and morphologically indistinct. Overall, their description and analysis lag far behind other microbes such as bacteria. Classification of species via high-throughput sequencing is increasingly becoming the norm for pathogen detection, microbiome studies, and environmental monitoring. With the rapid development of sequencing technologies, however, standardized procedures for taxonomic assignment of long sequence reads have not yet been well established. Focusing on nanopore sequencing technology, we compared classification and community composition analysis pipelines using shotgun and amplicon sequencing data generated from mock communities comprising 43 fungal species. We show that regardless of the sequencing methodology used, the highest accuracy of species identification was achieved by sequence alignment against a fungal-specific database. During the assessment of classification algorithms, we found that applying cutoffs to the query coverage of each read or contig significantly improved the classification accuracy and community composition analysis without major data loss. We also generated draft genome assemblies for three fungal species from nanopore data which were absent from genome databases. Our study improves sequence-based classification and estimation of relative sequence abundance using real fungal community data and provides a practical guide for the design of metagenomics analyses focusing on fungi.

Our study is unique in that it provides an in-depth comparative study of a real-life complex fungal community analyzed with multiple long- and short-read sequencing approaches. These technologies and their application are currently of great interest to diverse biologists as they seek to characterize the community compositions of microbiomes. Although great progress has been made on bacterial community compositions, microbial eukaryotes such as fungi clearly lag behind. Our study provides a detailed breakdown of strategies to improve species identification with immediate relevance to real-world studies. We find that real-life data sets do not always behave as expected, distinct from reports based on simulated data sets.
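
The query-coverage cutoff mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Hit` record layout, the species labels, the 0.5 threshold, and the choice to count reads rather than bases are all assumptions made for illustration. The idea is to compute, for the best alignment of each read or contig, the fraction of the query that the alignment covers, discard hits below the cutoff, and then estimate relative abundance from what remains.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """Best alignment for one read or contig (hypothetical record layout)."""
    query_id: str
    species: str       # placeholder label, not a species from the study
    query_length: int
    query_start: int   # 1-based, inclusive
    query_end: int     # 1-based, inclusive


def query_coverage(hit: Hit) -> float:
    """Fraction of the query sequence spanned by the alignment."""
    aligned = abs(hit.query_end - hit.query_start) + 1
    return aligned / hit.query_length


def relative_abundance(hits: Iterable[Hit], min_coverage: float) -> Dict[str, float]:
    """Per-species share of queries whose coverage meets the cutoff.

    Counting queries (not bases) is an assumption of this sketch.
    """
    counts = Counter(h.species for h in hits if query_coverage(h) >= min_coverage)
    total = sum(counts.values())
    return {sp: n / total for sp, n in counts.items()} if total else {}


# Toy data: read4 is a short local hit covering about 10% of the read.
hits = [
    Hit("read1", "species_A", 1000, 1, 950),
    Hit("read2", "species_A", 2000, 101, 1900),
    Hit("read3", "species_B", 1500, 1, 1400),
    Hit("read4", "species_B", 1200, 500, 620),
]

unfiltered = relative_abundance(hits, min_coverage=0.0)
filtered = relative_abundance(hits, min_coverage=0.5)  # 0.5 is an arbitrary example cutoff
```

In this toy input the low-coverage hit is the only record removed, so the cutoff shifts the estimated composition while discarding a single query. A suitable threshold for real data would need to be chosen empirically.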

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/20db/9040722/8025f6604f11/mbio.02444-21-f001.jpg
