Department of Computer Science and Statistics, University of Rhode Island, South Kingstown, Rhode Island, USA.
Health Outcomes, Department of Pharmacy Practice, College of Pharmacy, University of Rhode Island, South Kingstown, Rhode Island, USA.
Pharm Stat. 2022 Nov;21(6):1199-1218. doi: 10.1002/pst.2225. Epub 2022 May 10.
Health administrative data are often of limited use in epidemiological studies of drug safety in pregnancy because they lack information on gestational age at birth (GAB). Although several studies have proposed algorithms to estimate GAB from claims databases, failing to account for the unique distributional shape of GAB can introduce bias in the estimates and in subsequent modeling. We therefore develop a Bayesian latent class model to predict GAB. The model employs a mixture of Gaussian distributions with linear covariates within each class. This approach models heterogeneity in the population by identifying latent subgroups and estimating class-specific regression coefficients. We fit the model in a Bayesian framework, carrying out posterior computation with Markov chain Monte Carlo methods. The method is illustrated with a dataset of 10,043 Rhode Island Medicaid mother-child pairs. We found that the three-class and six-class mixture specifications maximized prediction accuracy. Based on our results, Medicaid women were partitioned into three classes, characterized by extreme preterm or preterm birth, preterm or "early" term birth, and "late" term birth. Obstetrical complications appeared to have a significant influence on class membership. Altogether, compared with traditional linear models, our approach shows an advantage in predictive accuracy because of its greater flexibility in modeling a skewed response and population heterogeneity.
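The core of the model described above, a Gaussian mixture with a class-specific linear predictor, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes fixed mixing weights (the abstract suggests class membership may itself depend on covariates), it evaluates the likelihood and class-membership probabilities at given parameter values rather than sampling them by MCMC, and every name and number in it (`class_responsibilities`, `mixture_log_likelihood`, the covariate, the coefficients, GAB measured in weeks) is hypothetical.

```python
import numpy as np


def normal_pdf(y, mean, sd):
    """Gaussian density, evaluated elementwise with broadcasting."""
    z = (y - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2.0 * np.pi))


def weighted_class_densities(y, X, weights, betas, sds):
    """Return an (n, K) array of weight_k * Normal(y_i | x_i' beta_k, sd_k).

    y: (n,) responses; X: (n, p) design matrix including an intercept column;
    weights: (K,) mixing weights; betas: (K, p) class-specific coefficients;
    sds: (K,) class-specific residual standard deviations.
    """
    means = X @ betas.T  # (n, K) class-specific linear predictors
    return weights * normal_pdf(y[:, None], means, sds)


def class_responsibilities(y, X, weights, betas, sds):
    """Probability that each observation belongs to each latent class."""
    dens = weighted_class_densities(y, X, weights, betas, sds)
    return dens / dens.sum(axis=1, keepdims=True)


def mixture_log_likelihood(y, X, weights, betas, sds):
    """Log-likelihood of the data under the mixture of regressions."""
    dens = weighted_class_densities(y, X, weights, betas, sds)
    return float(np.log(dens.sum(axis=1)).sum())


# Illustrative values only (hypothetical, not taken from the study).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]])  # intercept + one binary covariate
y = np.array([39.5, 33.0, 36.5])                    # assumed GAB in weeks
weights = np.array([0.1, 0.3, 0.6])                 # three latent classes
betas = np.array([[32.0, -2.0], [36.5, -1.0], [39.5, -0.5]])
sds = np.array([3.0, 1.5, 1.0])

resp = class_responsibilities(y, X, weights, betas, sds)
loglik = mixture_log_likelihood(y, X, weights, betas, sds)
```

Each row of `resp` sums to one and gives the class-membership probabilities for one mother-child pair at the supplied parameter values. In a Bayesian fit, quantities like these would be recomputed at every MCMC draw of the weights, coefficients, and standard deviations.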