Smith Louisa H, Cavanaugh Robert
Roux Institute, Northeastern University, Portland, ME 04101, United States.
Department of Public Health and Health Sciences, Bouvé College of Health Sciences, Northeastern University, Boston, MA 02115, United States.
J Am Med Inform Assoc. 2024 Dec 1;31(12):3013-3021. doi: 10.1093/jamia/ocae198.
Despite easy-to-use tools like the Cohort Builder, using All of Us Research Program data for complex research questions requires a relatively high level of technical expertise. We aimed to increase research and training capacity and reduce barriers to entry for the All of Us community through an R package, allofus. In this article, we describe functions that address common challenges we encountered while working with All of Us Research Program data, and we demonstrate this functionality with an example of creating a cohort of All of Us participants by synthesizing electronic health record and survey data with time dependencies.
All of Us Research Program data are widely available to health researchers. The allofus R package is aimed at researchers with varying levels of R experience who wish to conduct complex analyses using best practices for reproducibility and transparency. Because the All of Us data are transformed into the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), researchers who are familiar with existing OMOP CDM tools, or who wish to conduct network studies in conjunction with other OMOP CDM data, will also find value in the package.
We developed an initial set of functions that solve problems we experienced across survey and electronic health record data in our own research and in mentoring student projects. The package will continue to grow and develop with the All of Us Research Program. The allofus R package can help build community research capacity by increasing access to the All of Us Research Program data, the efficiency of its use, and the rigor and reproducibility of the resulting research.