EcoVision Lab, Photogrammetry and Remote Sensing Group, ETH Zürich, 8092 Zurich, Switzerland.
Department of Human Sciences and Quality of Life Promotion, San Raffaele University, 00166 Rome, Italy.
Nutrients. 2022 Apr 20;14(9):1705. doi: 10.3390/nu14091705.
Nutritional epidemiology employs observational data to discover associations between diet and disease risk. However, existing analytic methods for dietary data are often sub-optimal, with limited incorporation and analysis of the correlations between the studied variables and of nonlinear behaviours in the data. Machine learning (ML) is an area of artificial intelligence that has the potential to improve the modelling of the nonlinear associations and confounding found in nutritional data. These opportunities notwithstanding, applications of ML in nutritional epidemiology must be approached cautiously to safeguard the scientific quality of the results and provide accurate interpretations. Given the complexity surrounding ML, judicious application of such tools is necessary if they are to offer nutritional epidemiology a novel analytical resource for dietary measurement and assessment, and a tool to model the complexity of dietary intake and its relation to health. This work describes the applications of ML in nutritional epidemiology and provides guidelines to avoid common pitfalls encountered in applying predictive statistical models to nutritional data. Furthermore, it helps readers unfamiliar with the field better assess the significance of their results and suggests possible future directions for ML in nutritional epidemiology.