Semisupervised transfer learning for evaluation of model classification performance.

Affiliations

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States.

Division of Biostatistics, Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84108, United States.

Publication information

Biometrics. 2024 Jan 29;80(1). doi: 10.1093/biomtc/ujae002.

Abstract

In many modern machine learning applications, changes in covariate distributions and difficulty in acquiring outcome information have posed challenges to robust model training and evaluation. Numerous transfer learning methods have been developed to robustly adapt the model itself to unlabeled target populations using existing labeled data in a source population. However, there is a paucity of literature on transferring performance metrics, especially receiver operating characteristic (ROC) parameters, of a trained model. In this paper, we aim to evaluate the performance of a trained binary classifier on an unlabeled target population based on ROC analysis. We propose Semisupervised Transfer lEarning of Accuracy Measures (STEAM), an efficient three-step estimation procedure that employs (1) double-index modeling to construct calibrated density ratio weights and (2) robust imputation to leverage the large amount of unlabeled data to improve estimation efficiency. We establish the consistency and asymptotic normality of the proposed estimator under the correct specification of either the density ratio model or the outcome model. We also use cross-validation to correct for potential overfitting bias in the estimators in finite samples. We compare our proposed estimators to existing methods and show reductions in bias and gains in efficiency through simulations. We illustrate the practical utility of the proposed method by evaluating the prediction performance of a phenotyping model for rheumatoid arthritis (RA) on a temporally evolving electronic health record (EHR) cohort.
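The basic idea of transferring ROC parameters through density ratio weights can be illustrated with a minimal sketch. This is not the STEAM procedure: it omits the double-index calibration, the imputation step, and the cross-validation correction, and it uses a plain logistic model for the density ratio. All function names, the simulated data, and the choice of a logistic density ratio model are assumptions made for illustration only.

```python
import numpy as np

def fit_logistic(X, z, n_iter=50, ridge=1e-6):
    """Newton-Raphson logistic regression of z (0/1) on X, with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        grad = A.T @ (z - p) - ridge * beta
        hess = (A * (p * (1.0 - p))[:, None]).T @ A + ridge * np.eye(A.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def density_ratio_weights(X_source, X_target):
    """Weights proportional to p_target(x) / p_source(x) for source rows.

    Assumption: the log density ratio is linear in x, so the odds of
    'target' membership from a logistic model recover it up to a constant.
    """
    X = np.vstack([X_source, X_target])
    z = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])
    beta = fit_logistic(X, z)
    A = np.column_stack([np.ones(len(X_source)), X_source])
    w = np.exp(A @ beta)
    return w / w.mean()  # normalize so the weights average to 1

def weighted_roc_point(score, y, w, cutoff):
    """Weighted true and false positive rates at a given score cutoff."""
    pos, neg = w * (y == 1), w * (y == 0)
    flagged = score >= cutoff
    return np.sum(pos * flagged) / np.sum(pos), np.sum(neg * flagged) / np.sum(neg)

def weighted_auc(score, y, w):
    """Weighted AUC over all (positive, negative) pairs; ties count 1/2."""
    s1, w1 = score[y == 1], w[y == 1]
    s0, w0 = score[y == 0], w[y == 0]
    wins = (s1[:, None] > s0[None, :]) + 0.5 * (s1[:, None] == s0[None, :])
    return np.sum(w1[:, None] * w0[None, :] * wins) / (w1.sum() * w0.sum())

# Simulated example: labeled source sample, unlabeled shifted target sample.
rng = np.random.default_rng(0)
X_s = rng.normal(0.0, 1.0, size=(500, 2))   # source covariates (labeled)
X_t = rng.normal(0.5, 1.0, size=(2000, 2))  # target covariates (unlabeled)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
y_s = rng.binomial(1, sigmoid(X_s[:, 0] + 0.5 * X_s[:, 1]))
score_s = sigmoid(X_s[:, 0])                # hypothetical pre-trained classifier

w = density_ratio_weights(X_s, X_t)
auc_target = weighted_auc(score_s, y_s, w)
tpr, fpr = weighted_roc_point(score_s, y_s, w, cutoff=0.5)
```

With all weights equal, `weighted_auc` reduces to the usual empirical AUC. An estimator of this kind relies entirely on the density ratio model being correct, whereas the abstract states that the proposed estimator is consistent when either the density ratio model or the outcome model is correctly specified.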
