Department of Public Health Sciences, University of Chicago, Chicago, Illinois, United States of America.
Institute for Population and Precision Health, University of Chicago, Chicago, Illinois, United States of America.
PLoS One. 2022 Sep 1;17(9):e0272522. doi: 10.1371/journal.pone.0272522. eCollection 2022.
The NIH All of Us Research Program will have the scale and scope to enable research on a wide range of diseases, including cancer. The program's focus on diversity and inclusion promises a better understanding of the unequal burden of cancer. This study considers preliminary cancer ascertainment in the All of Us cohort from two data sources: self-reported surveys and electronic health records (EHR).
This work used data collected to date from the All of Us Research Program's 315,297 enrolled participants, accessed through the Researcher Workbench, where approved researchers can access and analyze All of Us data on cancer and other diseases. Cancer cases were ascertained from EHR data and self-reported surveys across key factors. The distribution of cancer types and the concordance of the two data sources by cancer site and demographics were analyzed.
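As an illustration of what a concordance check between two ascertainment sources can look like, the sketch below cross-tabulates a survey flag against an EHR flag and reports raw agreement and Cohen's kappa. It is not the authors' method: the function name `concordance`, the boolean flags, and the ten toy participants are all hypothetical, and the abstract does not say which agreement statistic the study used.

```python
from collections import Counter


def concordance(pairs):
    """Summarize agreement between two binary ascertainment sources.

    `pairs` is an iterable of (survey_flag, ehr_flag) booleans, one per
    participant who has BOTH data sources available (an assumption of
    this sketch, not a detail taken from the abstract).
    """
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one participant")

    cells = Counter(pairs)
    both = cells[(True, True)]
    neither = cells[(False, False)]
    survey_only = cells[(True, False)]
    ehr_only = cells[(False, True)]

    # Observed agreement: both sources say "cancer" or both say "no cancer".
    observed = (both + neither) / n

    # Agreement expected by chance, from each source's marginal rate.
    p_survey = (both + survey_only) / n
    p_ehr = (both + ehr_only) / n
    expected = p_survey * p_ehr + (1 - p_survey) * (1 - p_ehr)

    # Cohen's kappa is undefined when chance agreement is already 1.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")

    return {
        "both": both,
        "survey_only": survey_only,
        "ehr_only": ehr_only,
        "neither": neither,
        "observed_agreement": observed,
        "kappa": kappa,
    }


# Hypothetical toy cohort of ten participants (not All of Us data).
toy_pairs = (
    [(True, True)] * 2
    + [(True, False)] * 1
    + [(False, True)] * 1
    + [(False, False)] * 6
)
toy_result = concordance(toy_pairs)
print(toy_result)
```

Running the same function separately on subsets of participants, for example by cancer site or by demographic group, would give the stratified view of concordance that the abstract describes.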
Data collected from 315,297 participants resulted in 13,298 cancer cases detected in the survey (in 89,261 participants), 23,520 cancer cases detected in the EHR (in 203,813 participants), and 7,123 cancer cases detected across both sources (in 62,497 participants). Key differences in survey completion by race/ethnicity impacted the makeup of cohorts when compared to cancer in the EHR and national NCI SEER data.
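The counts above can be turned into crude proportions with simple arithmetic. The sketch below assumes that each parenthetical figure is the number of participants with that data source available (the denominator); the abstract does not state this explicitly, and the resulting proportions are not reported in the source. The names `reported_counts` and `crude_proportion` are illustrative only.

```python
# Counts as stated in the abstract: (cancer cases, participants).
# Treating the second number as a denominator is an assumption of this sketch.
reported_counts = {
    "survey": (13_298, 89_261),
    "ehr": (23_520, 203_813),
    "both": (7_123, 62_497),
}


def crude_proportion(cases, participants):
    """Return cases / participants as an unadjusted proportion."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return cases / participants


proportions = {
    source: crude_proportion(cases, participants)
    for source, (cases, participants) in reported_counts.items()
}

for source, value in proportions.items():
    cases, participants = reported_counts[source]
    print(f"{source}: {cases:,} / {participants:,} = {value:.1%}")
```

These are unadjusted figures. Because survey completion differed by race/ethnicity, the three groups are not directly comparable without accounting for who contributed each data source.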
This study provides key insight into cancer detection in the All of Us Research Program and points to the existing strengths and limitations of All of Us as a platform for cancer research now and in the future.