Efficient semiparametric inference for two-phase studies with outcome and covariate measurement errors.

Author information

Tao Ran, Lotspeich Sarah C, Amorim Gustavo, Shaw Pamela A, Shepherd Bryan E

Affiliations

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Publication information

Stat Med. 2021 Feb 10;40(3):725-738. doi: 10.1002/sim.8799. Epub 2020 Nov 3.

Abstract

In modern observational studies using electronic health records or other routinely collected data, both the outcome and covariates of interest can be error-prone and their errors often correlated. A cost-effective solution is the two-phase design, under which the error-prone outcome and covariates are observed for all subjects during the first phase and that information is used to select a validation subsample for accurate measurements of these variables in the second phase. Previous research on two-phase measurement error problems largely focused on scenarios where there are errors in covariates only or the validation sample is a simple random sample of study subjects. Herein, we propose a semiparametric approach to general two-phase measurement error problems with a quantitative outcome, allowing for correlated errors in the outcome and covariates and arbitrary second-phase selection. We devise a computationally efficient and numerically stable expectation-maximization algorithm to maximize the nonparametric likelihood function. The resulting estimators possess desired statistical properties. We demonstrate the superiority of the proposed methods over existing approaches through extensive simulation studies, and we illustrate their use in an observational HIV study.
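As a rough illustration of the two-phase setting described in the abstract, the sketch below simulates a quantitative outcome and a covariate whose measurement errors are correlated, then compares a naive regression on the error-prone phase-one data with a regression on a validated subsample. It is not the authors' semiparametric estimator or their EM algorithm. The function names (`simulate_two_phase`, `ols_slope`), the error covariance, the sample sizes, and the use of a simple random validation sample are all assumptions made for this example.

```python
import numpy as np


def simulate_two_phase(n=5000, n_validate=1000, beta=1.0, seed=0):
    """Simulate a toy two-phase study (all settings here are hypothetical)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                    # true covariate
    y = 0.5 + beta * x + rng.normal(size=n)   # true quantitative outcome

    # Correlated additive errors in the covariate and the outcome.
    error_cov = np.array([[0.5, 0.3],
                          [0.3, 0.5]])
    errors = rng.multivariate_normal([0.0, 0.0], error_cov, size=n)
    x_star = x + errors[:, 0]                 # error-prone covariate (phase one)
    y_star = y + errors[:, 1]                 # error-prone outcome (phase one)

    # Phase two: a simple random validation subsample (an assumption; the
    # abstract allows arbitrary selection based on phase-one data).
    validated = np.zeros(n, dtype=bool)
    validated[rng.choice(n, size=n_validate, replace=False)] = True
    return x, y, x_star, y_star, validated


def ols_slope(covariate, outcome):
    """Least-squares slope of outcome on covariate."""
    return np.polyfit(covariate, outcome, 1)[0]


x, y, x_star, y_star, validated = simulate_two_phase()

# Naive analysis: treat the error-prone phase-one measurements as the truth.
naive_slope = ols_slope(x_star, y_star)

# Validation-only analysis: use accurate measurements, but discard the
# phase-one information on unvalidated subjects.
validated_slope = ols_slope(x[validated], y[validated])

print(f"naive slope:           {naive_slope:.3f}")
print(f"validation-only slope: {validated_slope:.3f}")
```

Under these assumed settings the naive slope converges to (1.0 + 0.3) / (1.0 + 0.5), roughly 0.87, rather than the true value of 1.0, while the validation-only slope is consistent but uses only a fifth of the subjects. The approach summarized in the abstract aims to combine both phases, and to remain valid when the validation sample is not a simple random sample.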
