Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan.
Department of Computer Science, Bogor Agricultural University, Jl. Meranti Wing 20 Level 5 Kampus IPB Dramaga, Bogor, 16680, Indonesia.
Mol Inform. 2017 Dec;36(12). doi: 10.1002/minf.201700050. Epub 2017 Jul 6.
To better understand why some Jamu formulas can be used to treat specific diseases, we performed metabolomic studies of Jamu that take into account the biologically active compounds present in the plants used as Jamu ingredients. A thorough integration of omics information is expected to provide solid, evidence-based scientific rationales for the development of modern phytomedicines. This study focused on predicting Jamu efficacy from its component metabolites and on identifying the important metabolites related to each efficacy group. Initially, we compared the performance of Support Vector Machine and Random Forest classifiers in predicting Jamu efficacy under three data pre-processing approaches: no filtering, the Single Filtering algorithm, and a combination of the Single Filtering algorithm with feature selection using Regularized Random Forest. Both classifiers performed very well, and according to 5-fold cross-validation results, the mean accuracy of the Support Vector Machine with a linear kernel was slightly better than that of Random Forest. We conclude that machine learning methods can successfully relate Jamu efficacy to metabolites. In addition, we extended the analysis by identifying important metabolites from the Random Forest model. The inTrees framework was used to extract rules and to select important metabolites for each efficacy group. Overall, we identified 94 significant metabolites associated with 12 efficacy groups, and many of them were validated against published literature and the KNApSAcK Metabolite Activity database.
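The 5-fold cross-validation used to compare classifiers by mean accuracy can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`k_fold_indices`, `cross_val_accuracy`, `fit_centroids`, `predict_centroid`) are hypothetical, the toy presence/absence matrix is invented, and a simple nearest-centroid classifier stands in for the Support Vector Machine and Random Forest models, which are not reproduced here.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Mean accuracy over k folds for any fit/predict pair (assumes len(X) >= k)."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)


def fit_centroids(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(model, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Invented toy data: rows are formulas, columns are metabolite presence (1) or absence (0).
X = [[1, 1, 1, 0, 0, 1]] * 10 + [[0, 0, 0, 1, 1, 1]] * 10
y = ["group_A"] * 10 + ["group_B"] * 10

acc = cross_val_accuracy(fit_centroids, predict_centroid, X, y, k=5)
print(f"mean 5-fold accuracy: {acc:.2f}")
```

Because `cross_val_accuracy` accepts any `fit`/`predict` pair, two classifiers can be compared on identical folds by calling it twice with the same `seed`.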