Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States.
Division of Biostatistics, Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84108, United States.
Biometrics. 2024 Jan 29;80(1). doi: 10.1093/biomtc/ujae002.
In many modern machine learning applications, changes in covariate distributions and difficulty in acquiring outcome information pose challenges to robust model training and evaluation. Numerous transfer learning methods have been developed to robustly adapt the model itself to an unlabeled target population using existing labeled data from a source population. However, there is a paucity of literature on transferring the performance metrics of a trained model, especially receiver operating characteristic (ROC) parameters. In this paper, we aim to evaluate the performance of a trained binary classifier on an unlabeled target population based on ROC analysis. We propose Semisupervised Transfer lEarning of Accuracy Measures (STEAM), an efficient three-step estimation procedure that employs (1) double-index modeling to construct calibrated density ratio weights and (2) robust imputation to leverage the large amount of unlabeled data to improve estimation efficiency. We establish the consistency and asymptotic normality of the proposed estimator under the correct specification of either the density ratio model or the outcome model. We also use cross-validation to correct for potential finite-sample overfitting bias in the estimators. We compare our proposed estimators to existing methods and show reductions in bias and gains in efficiency through simulations. We illustrate the practical utility of the proposed method by evaluating the prediction performance of a phenotyping model for rheumatoid arthritis (RA) on a temporally evolving electronic health record (EHR) cohort.
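To make the density-ratio idea concrete, the following is a minimal sketch of plain importance-weighted AUC estimation under covariate shift. It is not the STEAM procedure: it omits the double-index calibration, the imputation step, and the cross-validation correction. The function names (`fit_logistic`, `density_ratio_weights`, `weighted_auc`), the log-linear density ratio model, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, z, n_iter=50, ridge=1e-6):
    """Newton-Raphson logistic regression of z in {0, 1} on X (intercept added)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (z - p) - ridge * beta
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def density_ratio_weights(X_src, X_tgt):
    """Weights proportional to p_target(x) / p_source(x), evaluated on source rows.

    Assumption: the log density ratio is linear in x, so the odds from a
    source-vs-target logistic classifier recover it up to a constant.
    """
    X = np.vstack([X_src, X_tgt])
    z = np.concatenate([np.zeros(len(X_src)), np.ones(len(X_tgt))])
    beta = fit_logistic(X, z)
    w = np.exp(beta[0] + X_src @ beta[1:])
    return w / w.mean()  # normalize so the weights average to 1

def weighted_auc(score, y, w):
    """Weighted P(score_pos > score_neg) + 0.5 * P(tie) over positive/negative pairs."""
    score, y, w = np.asarray(score, float), np.asarray(y), np.asarray(w, float)
    sp, sn = score[y == 1], score[y == 0]
    wp, wn = w[y == 1], w[y == 0]
    gt = (sp[:, None] > sn[None, :]).astype(float)
    eq = (sp[:, None] == sn[None, :]).astype(float)
    return float(wp @ (gt + 0.5 * eq) @ wn) / (wp.sum() * wn.sum())

# Simulated illustration (hypothetical data): labeled source, unlabeled shifted target.
rng = np.random.default_rng(0)
X_src = rng.normal(0.0, 1.0, size=(500, 2))
X_tgt = rng.normal(0.5, 1.0, size=(2000, 2))
p_src = 1.0 / (1.0 + np.exp(-(X_src @ np.array([1.0, -0.5]))))
y_src = rng.binomial(1, p_src)
score = X_src[:, 0]  # stand-in for the trained classifier's score

w = density_ratio_weights(X_src, X_tgt)
print("source AUC:          ", weighted_auc(score, y_src, np.ones(len(y_src))))
print("target-weighted AUC: ", weighted_auc(score, y_src, w))
```

An estimator of this form is consistent only when the density ratio model is correctly specified; the double robustness described in the abstract additionally relies on the outcome model, which this sketch does not include.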