Importance of missingness in baseline variables: A case study of the All of Us Research Program.

Affiliations

Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America.

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America.

Publication information

PLoS One. 2023 May 18;18(5):e0285848. doi: 10.1371/journal.pone.0285848. eCollection 2023.

DOI:10.1371/journal.pone.0285848
PMID:37200348
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10194909/
Abstract

OBJECTIVE

The All of Us Research Program collects data from multiple information sources, including health surveys, to build a national longitudinal research repository that researchers can use to advance precision medicine. Missing survey responses pose challenges to study conclusions. We describe missingness in All of Us baseline surveys.

STUDY DESIGN AND SETTING

We extracted survey responses submitted between May 31, 2017, and September 30, 2020. Missing percentages for groups historically underrepresented in biomedical research were compared with those for represented groups. Associations of missing percentages with age, health literacy score, and survey completion date were evaluated. We used negative binomial regression to evaluate the association of participant characteristics with the number of missed questions out of the total eligible questions for each participant.
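As an illustration of the quantity being modeled, the following minimal sketch computes a per-participant skip percentage (missed questions out of eligible questions) and summarizes it with a median and interquartile range. The participant counts are hypothetical and not taken from the study, and the function name `skip_percentage` is an assumption introduced here for illustration.

```python
from statistics import median, quantiles

# Hypothetical (missed, eligible) question counts per participant.
# These values are illustrative only and do not come from All of Us data.
participants = [(2, 100), (5, 100), (8, 120), (3, 90), (12, 110)]


def skip_percentage(missed: int, eligible: int) -> float:
    """Return the percentage of eligible questions a participant skipped."""
    if eligible <= 0:
        raise ValueError("eligible must be a positive count")
    return 100.0 * missed / eligible


rates = [skip_percentage(m, e) for m, e in participants]

# Quartiles of the skip percentage; q1 and q3 bound the interquartile range.
q1, q2, q3 = quantiles(rates, n=4, method="inclusive")

print(f"median skip rate: {median(rates):.1f}%")
print(f"IQR: {q1:.1f}% to {q3:.1f}%")
```

In a count regression, the number of eligible questions would typically enter as an exposure (a log offset), so that coefficients describe differences in the skip rate rather than in the raw count. The abstract does not state how the authors parameterized their model, so this is an assumption.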

RESULTS

The dataset analyzed contained data for 334,183 participants who submitted at least one baseline survey. Almost all (97.0%) of the participants completed all baseline surveys, and only 541 (0.2%) participants skipped all questions in at least one of the baseline surveys. The median skip rate was 5.0% of the questions, with an interquartile range (IQR) of 2.5% to 7.9%. Historically underrepresented groups were associated with higher missingness (incidence rate ratio (IRR) [95% CI]: 1.26 [1.25, 1.27] for Black/African American compared to White). Missing percentages were similar by survey completion date, participant age, and health literacy score. Skipping specific questions was associated with higher missingness (IRRs [95% CI]: 1.39 [1.38, 1.40] for skipping income, 1.92 [1.89, 1.95] for skipping education, 2.19 [2.09, 2.30] for skipping sexual and gender questions).
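To make the IRR figures easier to read, the sketch below shows how an incidence rate ratio and a Wald-type 95% interval are conventionally obtained by exponentiating a log-scale regression coefficient. The coefficient and standard error are hypothetical values chosen only so that the output resembles the reported 1.26 [1.25, 1.27]. They are not estimates from the paper, and the helper `irr_with_ci` is an illustrative name.

```python
import math


def irr_with_ci(coef: float, se: float, z: float = 1.96):
    """Exponentiate a log-scale coefficient and its Wald interval bounds."""
    return (
        math.exp(coef),           # incidence rate ratio
        math.exp(coef - z * se),  # lower confidence bound
        math.exp(coef + z * se),  # upper confidence bound
    )


# Hypothetical inputs (assumptions, not values reported by the study).
coef = math.log(1.26)
se = 0.004

irr, lower, upper = irr_with_ci(coef, se)
print(f"IRR {irr:.2f} [{lower:.2f}, {upper:.2f}]")
```

Read this way, an IRR of 1.26 corresponds to a skip rate about 26% higher than in the reference group, holding the other modeled characteristics fixed.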

CONCLUSION

Surveys in the All of Us Research Program will form an essential component of the data researchers can use to perform their analyses. Missingness was low in All of Us baseline surveys, but group differences exist. Additional statistical methods and careful analysis of surveys could help mitigate challenges to the validity of conclusions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a1bf/10194909/c5b4ac17a8fe/pone.0285848.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a1bf/10194909/27fc9b6512ae/pone.0285848.g001.jpg
