Kim Jihun, Koh Hyunwook
Department of Applied Mathematics and Statistics, The State University of New York, Korea (SUNY Korea), Incheon 21985, Republic of Korea.
Microorganisms. 2023 Nov 20;11(11):2816. doi: 10.3390/microorganisms11112816.
The advent of next-generation sequencing has greatly accelerated human microbiome research. Investigators are now competing to find new ways to diagnose, treat and prevent human diseases through the human microbiome. Machine learning is a promising approach for this effort, especially given the high complexity of microbiome data. However, many current machine learning algorithms are "black boxes", i.e., they are difficult to understand and interpret. In addition, clinicians, public health practitioners and biologists are often not skilled at computer programming, and they do not always have access to high-end computing devices. Thus, in this study, we introduce a unified web cloud analytic platform, named MiTree, for user-friendly and interpretable microbiome data mining. MiTree employs tree-based learning methods, including decision tree, random forest and gradient boosting, that are well understood and well suited to human microbiome studies. MiTree can address both classification and regression problems through covariate-adjusted or unadjusted analysis. MiTree should serve as an easy-to-use and interpretable data mining tool for microbiome-based disease prediction modeling, and should provide new insights into microbiome-based diagnostics, treatment and prevention. MiTree is open-source software that is available on our web server.
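To illustrate why tree-based methods are considered interpretable, the following is a minimal sketch of a one-split decision tree (a "stump") that picks the single taxon and abundance threshold minimizing weighted Gini impurity. It is not MiTree's implementation: the taxon names, relative abundances and disease labels are hypothetical, and the helper functions `gini` and `best_split` are introduced here only for illustration.

```python
# Illustrative only: a one-split decision tree (a "stump") on made-up data.
# Taxon names, abundances and labels are hypothetical, not taken from MiTree.

def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, feature_names):
    """Return (weighted impurity, feature name, threshold) of the best single split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted(set(row[j] for row in rows))
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] <= thr]
            right = [y for row, y in zip(rows, labels) if row[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, name, thr)
    return best

# Hypothetical relative abundances of two taxa in eight samples.
feature_names = ["taxon_A", "taxon_B"]
rows = [
    (0.10, 0.40), (0.15, 0.10), (0.20, 0.35), (0.25, 0.05),
    (0.60, 0.30), (0.65, 0.15), (0.70, 0.45), (0.80, 0.20),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = control, 1 = disease (made up)

score, name, thr = best_split(rows, labels, feature_names)
print(f"if {name} <= {thr:.3f}: predict control; else: predict disease "
      f"(weighted Gini {score:.3f})")
```

The fitted model is a single human-readable rule, which is the property that makes a decision tree easy to inspect. Random forest and gradient boosting combine many such trees, so they are usually interpreted through summaries such as feature importance rather than a single rule.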