Toga Arthur W, Phatak Mukta, Pappas Ioannis, Thompson Simon, McHugh Caitlin P, Clement Matthew H S, Bauermeister Sarah, Maruyama Tetsuyuki, Gallacher John
Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA, United States.
Alzheimer's Disease Data Initiative, Kirkland, WA, United States.
Front Neuroinform. 2023 May 25;17:1175689. doi: 10.3389/fninf.2023.1175689. eCollection 2023.
There is broad consensus that data sharing accelerates science. Data sharing enhances the utility of data and promotes the creation and competition of scientific ideas. Within the Alzheimer's disease and related dementias (ADRD) community, data types and modalities are spread across many organizations, geographies, and governance structures. The ADRD community is not alone in facing these challenges; however, the problem is even more difficult because of the need to share complex biomarker data from centers around the world. Heavy-handed data sharing mandates have, to date, met with limited success and often outright resistance. Interest in making data Findable, Accessible, Interoperable, and Reusable (FAIR) has often resulted in centralized platforms. However, when data governance and sovereignty structures do not allow the movement of data, other methods, such as federation, must be pursued. Implementation of fully federated data approaches is not without its challenges. The user experience may become more complicated, and federated analysis of unstructured data types remains challenging. Advancement in federated data sharing should be accompanied by improvement in federated learning methodologies so that federated data sharing becomes functionally equivalent to direct access to record-level data. In this article, we discuss federated data sharing approaches implemented by three data platforms in the ADRD field: Dementias Platform UK (DPUK) in 2014, the Global Alzheimer's Association Interactive Network (GAAIN) in 2012, and the Alzheimer's Disease Data Initiative (ADDI) in 2020. We conclude by addressing open questions that the research community needs to solve together.
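The idea that federated analysis can be functionally equivalent to direct access to record-level data can be illustrated with a minimal sketch. It is not the method used by DPUK, GAAIN, or ADDI; it assumes a hypothetical setting in which each site releases only three aggregates (count, sum, and sum of squares) and a coordinator combines them. The names `SiteSummary`, `summarize`, and `federated_mean_variance`, and all numbers, are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site releases in place of its records (hypothetical schema)."""
    n: int            # number of records held at the site
    total: float      # sum of the measured values
    total_sq: float   # sum of the squared values


def summarize(values: list[float]) -> SiteSummary:
    """Runs inside the site's own environment; only the summary leaves."""
    return SiteSummary(len(values), sum(values), sum(v * v for v in values))


def federated_mean_variance(summaries: list[SiteSummary]) -> tuple[float, float]:
    """Combine site summaries into the pooled mean and population variance."""
    n = sum(s.n for s in summaries)
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    return mean, variance


# Made-up biomarker values at three sites, for illustration only.
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0]
site_c = [6.0, 7.0, 8.0, 9.0]

fed_mean, fed_var = federated_mean_variance(
    [summarize(site) for site in (site_a, site_b, site_c)]
)

# Reference result computed with direct access to every record.
pooled = site_a + site_b + site_c
pooled_mean, pooled_var = fmean(pooled), pvariance(pooled)
```

For simple statistics such as these, the federated and pooled results agree up to floating-point rounding. Model fitting and unstructured data are harder to federate, and a real deployment would also need access control and disclosure checks that this sketch omits.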