Department of Computer Science, University of California at Los Angeles, Los Angeles, CA 90095, USA.
Methods. 2019 Aug 15;166:74-82. doi: 10.1016/j.ymeth.2019.03.003. Epub 2019 Mar 16.
The human microbiome plays a number of critical roles, impacting almost every aspect of human health and well-being. Conditions in the microbiome have been linked to a number of significant diseases. Additionally, revolutions in sequencing technology have led to a rapid increase in publicly available sequencing data. Consequently, there have been growing efforts to predict disease status from metagenomic sequencing data, with a proliferation of new approaches in the last few years. Some of these efforts have explored deep learning, a powerful form of machine learning that has been applied successfully in several biological domains. Here, we review some of these methods and the algorithms they are based on, with a particular focus on deep learning methods. We also perform a deeper analysis of Type 2 Diabetes and obesity datasets, for which improved results have so far proved elusive, using a variety of machine learning and feature extraction methods. We conclude by offering perspectives on study design considerations that may impact results, and on future directions the field can take to improve results and offer more valuable conclusions. The scripts and extracted features for the analyses conducted in this paper are available via GitHub: https://github.com/nlapier2/metapheno.