Quantitative and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.
Department of Pediatrics, Division of Gastroenterology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
Genome Biol. 2019 Aug 6;20(1):154. doi: 10.1186/s13059-019-1773-5.
We develop a metagenomic data analysis pipeline, MicroPro, that takes into account all reads from known and unknown microbial organisms and associates viruses with complex diseases. We use MicroPro to analyze four metagenomic datasets relating to colorectal cancer, type 2 diabetes, and liver cirrhosis, and show that including reads from unknown organisms significantly increases the prediction accuracy of disease status for three of the four datasets. We identify new microbial organisms associated with these diseases and show that viruses play important predictive roles in colorectal cancer and liver cirrhosis, but not in type 2 diabetes. MicroPro is freely available at https://github.com/zifanzhu/MicroPro.