Morfopoulou Sofia, Plagnol Vincent
UCL Genetics Institute, University College London, London WC1E 6BT, UK.
Bioinformatics. 2015 Sep 15;31(18):2930-8. doi: 10.1093/bioinformatics/btv317. Epub 2015 May 21.
Deep sequencing of clinical samples is now an established tool for the detection of infectious pathogens, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, in particular for viral pathogens. Here we present metaMix, a Bayesian mixture model framework for resolving complex metagenomic mixtures. We show that the use of parallel Monte Carlo Markov chains for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture.
We demonstrate the greater accuracy of metaMix compared with relevant methods, particularly for profiling complex communities consisting of several related species. We designed metaMix specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection; however, the principles are generally applicable to all types of metagenomic mixtures.
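The mixture-model idea described above, in which a short read can match several organisms and the model must decide which species actually contribute, can be illustrated with a small sketch. This is not the metaMix implementation: metaMix is Bayesian and explores the species space with parallel MCMC, whereas the sketch below swaps in a plain expectation-maximization (EM) estimate of mixture weights over a fixed candidate set. The function name `estimate_mixture_weights`, the toy likelihood values, and the three-species setup are all assumptions made for illustration.

```python
# Illustrative sketch only: maximum-likelihood EM for mixture weights,
# NOT the Bayesian MCMC procedure used by metaMix.
# All names and numbers below are hypothetical.

def estimate_mixture_weights(likelihoods, n_iter=200, tol=1e-10):
    """Estimate species proportions from per-read match likelihoods.

    likelihoods[i][k] is the (assumed known) probability of read i
    given that it originates from species k.
    """
    n_species = len(likelihoods[0])
    weights = [1.0 / n_species] * n_species  # start from uniform proportions

    for _ in range(n_iter):
        counts = [0.0] * n_species
        # E-step: split each read fractionally among the species it matches.
        for row in likelihoods:
            joint = [w * p for w, p in zip(weights, row)]
            total = sum(joint)
            if total == 0.0:
                continue  # read matches no candidate species; ignore it
            for k, j in enumerate(joint):
                counts[k] += j / total
        # M-step: new proportions are the normalized fractional read counts.
        assigned = sum(counts)
        new_weights = [c / assigned for c in counts]
        change = max(abs(a - b) for a, b in zip(weights, new_weights))
        weights = new_weights
        if change < tol:
            break
    return weights


# Toy data: species A and B are closely related, species C is distinct.
reads = (
    [[1.0, 0.0, 0.0]] * 6    # six reads matching only A
    + [[0.5, 0.5, 0.0]] * 3  # three reads matching A and B equally well
    + [[0.0, 0.0, 1.0]] * 1  # one read matching only C
)

weights = estimate_mixture_weights(reads)
print([round(w, 4) for w in weights])
```

In this toy case species B has no reads of its own, so the ambiguous reads are progressively reassigned to A and the estimated weight of B shrinks toward zero. A full Bayesian treatment such as the one the abstract describes would additionally compare alternative species sets rather than fixing the candidates in advance.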
metaMix is implemented as a user-friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix
sofia.morfopoulou.10@ucl.ac.uk
Supplementary data are available at Bioinformatics online.