Department of Electrical and Computer Engineering, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA, USA.
BMC Bioinformatics. 2019 Jun 20;20(Suppl 12):314. doi: 10.1186/s12859-019-2833-2.
Microbiome profiles from the human body and environmental niches have become publicly available thanks to recent advances in high-throughput sequencing technologies. Recent studies have already identified different microbiome profiles in healthy and sick individuals for a variety of diseases, which suggests that the microbiome profile can serve as a diagnostic tool for identifying an individual's disease state. However, the high-dimensional nature of metagenomic data poses a significant challenge to existing machine learning models. Consequently, enabling personalized treatments requires an efficient framework that can accurately and robustly differentiate between healthy and sick microbiome profiles.
In this paper, we propose MetaNN (i.e., classification of host phenotypes from Metagenomic data using Neural Networks), a neural network framework that uses a new data augmentation technique to mitigate the effects of overfitting.
We show that MetaNN outperforms existing state-of-the-art models in terms of classification accuracy for both synthetic and real metagenomic data. These results pave the way towards developing personalized treatments for microbiome-related diseases.