Center for Health Outcomes Research, Saint Louis University, Saint Louis, Missouri 63104, USA
Department of Computer Science, Bellarmine University, Louisville, Kentucky 40205, USA
Annu Rev Public Health. 2020 Apr 2;41:21-36. doi: 10.1146/annurev-publhealth-040119-094437. Epub 2019 Oct 2.
Machine learning approaches to modeling epidemiologic data are becoming increasingly prevalent in the literature. These methods have the potential to improve our understanding of health and opportunities for intervention, far beyond our past capabilities. This article provides a walkthrough for creating supervised machine learning models with current examples from the literature. From identifying an appropriate sample and selecting features through training, testing, and assessing performance, the end-to-end approach to machine learning can be a daunting task. We take the reader through each step in the process and discuss novel concepts in the area of machine learning, including identifying treatment effects and explaining the output from machine learning models.