Research Group Bioinformatics, NG4, Robert Koch-Institut, Nordufer 20, 13353 Berlin, Germany.
Nucleic Acids Res. 2013 Jan 7;41(1):e10. doi: 10.1093/nar/gks803. Epub 2012 Aug 31.
One goal of sequencing-based metagenomic community analysis is the quantitative taxonomic assessment of microbial community compositions. In particular, relative quantification of taxa is highly relevant for metagenomic diagnostics and microbial community comparison. However, most existing approaches quantify at low resolution (e.g. at phylum level), rely on the presence of specific marker genes (e.g. 16S), or have severe problems discerning species with highly similar genome sequences. Yet applications such as metagenomic diagnostics require accurate quantification at the species level. We developed Genome Abundance Similarity Correction (GASiC), a method that estimates true genome abundances from read alignments by accounting for reference genome similarities in a non-negative LASSO approach. We demonstrate GASiC's superior performance over existing methods on simulated benchmark data as well as on real data. In addition, we present applications to datasets derived from both bacterial DNA and viral RNA. We further discuss our approach as an alternative to PCR-based DNA quantification.
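To make the idea of similarity correction concrete, the following is a minimal sketch of a non-negative LASSO solved by coordinate descent. It is not the GASiC implementation. The function name `correct_abundances`, the convention that `similarity[i][j]` is the fraction of reads from genome j that align to reference i, the penalty weight `lam`, and the toy numbers are all assumptions made for illustration.

```python
def correct_abundances(similarity, observed, lam=0.0, sweeps=10000, tol=1e-12):
    """Minimise 0.5 * ||observed - similarity @ c||^2 + lam * sum(c) subject to c >= 0.

    similarity[i][j]: assumed fraction of reads from genome j aligning to reference i.
    observed[i]:      number of reads aligned to reference i.
    Returns the estimated per-genome read counts c (hypothetical helper).
    """
    n = len(observed)
    c = [0.0] * n
    # With c = 0 the residual (observed - similarity @ c) equals the observations.
    residual = [float(x) for x in observed]
    col_norms = [sum(similarity[i][j] ** 2 for i in range(n)) for j in range(n)]

    for _ in range(sweeps):
        max_change = 0.0
        for j in range(n):
            if col_norms[j] == 0.0:
                continue
            # Correlate column j with the residual that excludes genome j's own contribution.
            rho = sum(
                similarity[i][j] * (residual[i] + similarity[i][j] * c[j])
                for i in range(n)
            )
            # Shrink by lam, then clip at zero to enforce non-negativity.
            new_cj = max(0.0, (rho - lam) / col_norms[j])
            delta = new_cj - c[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= similarity[i][j] * delta
                c[j] = new_cj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return c


# Toy example: two highly similar genomes; 80% of each genome's reads
# also align to the other reference (made-up numbers).
toy_similarity = [[1.0, 0.8],
                  [0.8, 1.0]]
toy_observed = [94.0, 86.0]  # generated from true read counts of 70 and 30
estimated = correct_abundances(toy_similarity, toy_observed)
```

In this toy case the raw alignment counts (94 and 86) suggest two almost equally abundant genomes, whereas the corrected estimate should come out close to the 70 and 30 used to generate the data. Setting `lam` above zero additionally pushes small abundances towards exactly zero.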