Forssen Henrietta, Patel Riyaz, Fitzpatrick Natalie, Hingorani Aroon, Timmis Adam, Hemingway Harry, Denaxas Spiros
Department of Computer Science, UCL.
Institute of Health Informatics, UCL.
Stud Health Technol Inform. 2017;235:111-115.
Metabolomic data could enable accurate, non-invasive and low-cost prediction of coronary artery disease. Regression-based analytical approaches, however, may not fully account for interactions between metabolites and rely on a priori selected input features, and may therefore be less accurate. Supervised machine learning methods could exploit the dimensionality and richness of the data more fully. In this paper, we systematically implement and evaluate a set of supervised learning methods (L1 regression, random forest classifier) and compare them with traditional regression-based approaches for disease prediction using metabolomic data.
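The kind of comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` in place of real metabolite measurements, treats "L1 regression" as L1-penalised logistic regression, and models the "a priori selected" features as a hypothetical fixed subset of columns (`PRESELECTED`). The hyperparameters and the cross-validated AUC metric are likewise assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a (patients x metabolites) matrix; NOT study data.
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=8, n_redundant=5, random_state=0
)

# Hypothetical a priori feature subset used by the baseline regression.
PRESELECTED = slice(0, 10)


def evaluate(X, y, cv=5):
    """Return the mean cross-validated ROC AUC for each approach."""
    # Baseline: plain logistic regression on the pre-selected columns only.
    baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    # L1-penalised logistic regression on all columns; the penalty drives
    # some coefficients to zero, so feature selection is data-driven.
    l1_model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
    )
    # Random forest on all columns; trees can capture feature interactions.
    forest = RandomForestClassifier(n_estimators=200, random_state=0)

    def auc(model, features):
        return cross_val_score(model, features, y, cv=cv, scoring="roc_auc").mean()

    return {
        "logistic, preselected features": auc(baseline, X[:, PRESELECTED]),
        "L1 logistic, all features": auc(l1_model, X),
        "random forest, all features": auc(forest, X),
    }


scores = evaluate(X, y)
for name, score in scores.items():
    print(f"{name}: mean AUC = {score:.3f}")
```

Scores obtained on this synthetic data say nothing about the relative performance of the methods on real metabolomic data; the sketch only shows how the three approaches can be evaluated under the same cross-validation scheme.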