Goldstein Neal D, LeVasseur Michael T, McClure Leslie A
Neal D. Goldstein is an assistant research professor, Michael T. LeVasseur is a visiting assistant teaching professor, and Leslie A. McClure is a professor and chair of the Department of Epidemiology and Biostatistics at the Drexel University Dornsife School of Public Health, Philadelphia, PA, USA.
Harv Data Sci Rev. 2020 Spring;2(2). doi: 10.1162/99608f92.9f0215e6. Epub 2020 Apr 30.
Epidemiology, biostatistics, and data science are broad disciplines that incorporate a variety of substantive areas. Common among them is a focus on quantitative approaches for solving intricate problems. When the substantive area is health and health care, the overlap is further cemented. Researchers in these disciplines are fluent in statistics, data management and analysis, and health and medicine, to name but a few competencies. Yet there are important and perhaps mutually exclusive attributes of these fields that warrant a tighter integration. For example, epidemiologists receive substantial training in the science of study design, measurement, and the art of causal inference. Biostatisticians are well versed in the theory and application of methodological techniques, as well as the design and conduct of public health research. Data scientists receive equivalently rigorous training in computational and visualization approaches for high-dimensional data. Compared to data scientists, epidemiologists and biostatisticians may have less expertise in computer science and informatics, while data scientists may benefit from a working knowledge of study design and causal inference. Collaboration and cross-training offer the opportunity to share and learn the constructs, frameworks, theories, and methods of these fields, with the goal of offering fresh and innovative perspectives for tackling challenging problems in health and health care. In this article, we first describe the evolution of these fields, focusing on their convergence in the era of electronic health data, notably electronic medical records (EMRs). Next, we present how a collaborative team may design, analyze, and implement an EMR-based study. Finally, we review the curricula at leading epidemiology, biostatistics, and data science training programs, identifying gaps and offering suggestions for the fields moving forward.