Piao Jin, Ning Jing, Chambers Christina D, Xu Ronghui
Department of Biostatistics, The University of Texas School of Public Health, 1200 Pressler Street, Houston, TX 77030, USA.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, 1400 Pressler St, Houston, TX 77030, USA.
Biostatistics. 2018 Jan 1;19(1):54-70. doi: 10.1093/biostatistics/kxx024.
Evaluating and understanding the risk and safety of using medications for autoimmune disease during pregnancy will help both clinicians and pregnant women make better treatment decisions. However, using spontaneous abortion (SAB) data collected in observational studies of pregnancy to derive valid inference poses two major challenges. First, because of the sampling mechanism, the data from the observational cohort are not a random sample of the target population. Pregnant women with early SAB are more likely to be excluded from the cohort, so the observed SAB times may differ substantially from those in the target population. Second, the observed data are heterogeneous and contain a "cured" proportion of women who will never experience SAB. In this article, we consider semiparametric models to simultaneously estimate the probability of being cured and the distribution of time to SAB for the uncured subgroup. To derive the maximum likelihood estimators, we adjust for the sampling bias in the likelihood function and develop an expectation-maximization algorithm to overcome the computational challenge. We apply empirical process theory to prove the consistency and asymptotic normality of the estimators. We examine the finite sample performance of the proposed estimators in simulation studies and illustrate the proposed method through an application to SAB data from pregnant women.
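To make the two challenges concrete, the following is a sketch of how a cured proportion and a sampling-bias adjustment can enter a likelihood together. It uses a generic mixture cure formulation with left truncation at study entry, which is an assumption made here for illustration: the abstract does not name the sampling mechanism or specify the model, and all symbols ($\pi$, $S_u$, $f_u$, $A_i$, $X_i$, $\delta_i$, $Z_i$) are illustrative notation rather than the authors' own.

```latex
% Illustrative notation (not taken from the paper):
%   \pi(Z)      probability of being uncured, given covariates Z
%   S_u, f_u    survival and density functions of the time to SAB among the uncured
%   A_i         gestational age at study entry (assumed left-truncation time)
%   X_i         observed time; \delta_i = 1 if SAB is observed, 0 if censored
\begin{align*}
  % Population survival function: a mixture of cured and uncured women
  S_{\mathrm{pop}}(t \mid Z) &= 1 - \pi(Z) + \pi(Z)\, S_u(t \mid Z), \\[4pt]
  % Likelihood contribution of woman i, conditional on no SAB before entry at A_i
  L_i &= \frac{\bigl[\pi(Z_i)\, f_u(X_i \mid Z_i)\bigr]^{\delta_i}
               \bigl[1 - \pi(Z_i) + \pi(Z_i)\, S_u(X_i \mid Z_i)\bigr]^{1-\delta_i}}
              {1 - \pi(Z_i) + \pi(Z_i)\, S_u(A_i \mid Z_i)} .
\end{align*}
```

Under these assumptions, the denominator is the probability that a woman has not had an SAB by her entry time, which is what corrects for the under-representation of early SAB. For a censored woman the cure status is unobserved, which is the kind of latent variable an expectation-maximization algorithm is typically used to handle.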