Department of Therapeutic Radiology, Yale School of Medicine, New Haven, CT, USA.
Center for Outcomes Research and Evaluation at Yale, New Haven, CT, USA.
Yearb Med Inform. 2023 Aug;32(1):104-110. doi: 10.1055/s-0043-1768721. Epub 2023 Jul 6.
Despite growing enthusiasm for using clinical informatics to improve cancer outcomes, data availability remains a persistent bottleneck to progress. Difficulty combining data that contain protected health information often limits our ability to aggregate larger, more representative datasets for analysis. With the rise of machine learning techniques that require ever larger amounts of clinical data, these barriers have grown. Here, we review recent efforts within clinical informatics to address issues related to safely sharing cancer data.
We carried out a narrative review of clinical informatics studies, published from 2018 to 2022, on sharing protected health data in cancer research, with a focus on domains such as decentralized analytics, homomorphic encryption, and common data models.
Clinical informatics studies that investigated cancer data sharing were identified. The search focused in particular on studies of decentralized analytics, homomorphic encryption, and common data models. Decentralized analytics has been prototyped across genomic, imaging, and clinical data, with the most advances in diagnostic image analysis. Homomorphic encryption was most often applied to genomic data and less often to imaging and clinical data. Common data models primarily involve clinical data from the electronic health record. Although all three methods are supported by robust research, few studies demonstrate wide-scale implementation.
Decentralized analytics, homomorphic encryption, and common data models represent promising solutions to improve cancer data sharing. However, the encouraging results to date have been limited to smaller settings. Future studies should focus on evaluating the scalability and efficacy of these methods across clinical settings of varying resources and expertise.