Peek Niels, Rodrigues Pedro Pereira
1 Division of Informatics, Imaging, and Data Science, School of Health Sciences, University of Manchester, Manchester, UK.
2 NIHR Greater Manchester Patient Safety Translational Research Centre, University of Manchester, Manchester, UK.
Int J Data Sci Anal. 2018;6(3):261-269. doi: 10.1007/s41060-018-0109-y. Epub 2018 Mar 7.
The routine operation of modern healthcare systems produces a wealth of data in electronic health records, administrative databases, clinical registries, and other clinical systems. It is widely acknowledged that there is great potential for utilising these routine data for health research to derive new knowledge about health, disease, and treatments. However, the reuse of routine healthcare data for research is not beyond debate. In this paper, we discuss three issues that have stirred considerable controversy among health data scientists. First, we discuss van der Lei's 1st Law of Medical Informatics, which states that data shall be used only for the purpose for which they were collected. Then, we discuss the extent to which routine data sources and innovations in analytical methods alleviate the need to conduct randomised clinical trials. Finally, we address questions of governance, privacy, and trust that arise when routine health data are made available for research. While we do not think that there is a definite "right answer" for any of these issues, we argue that data scientists should be aware of the arguments for different viewpoints, respect their validity, and contribute constructively to the debate. The three controversies discussed in this paper relate to core challenges for research with health data and define an essential research agenda for the health data science community.