Constrained binary classification using ensemble learning: an application to cost-efficient targeted PrEP strategies.

Author information

Zheng Wenjing, Balzer Laura, van der Laan Mark, Petersen Maya

Affiliations

Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, U.S.A.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, U.S.A.

Publication information

Stat Med. 2018 Jan 30;37(2):261-279. doi: 10.1002/sim.7296. Epub 2017 Apr 6.

Abstract

Binary classification problems are ubiquitous in health and social sciences. In many cases, one wishes to balance two competing optimality considerations for a binary classifier. For instance, in resource-limited settings, a human immunodeficiency virus (HIV) prevention program based on offering pre-exposure prophylaxis (PrEP) to selected high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program. In this article, we consider a general class of constrained binary classification problems wherein the objective function and the constraint are both monotonic with respect to a threshold. These include the minimization of the rate of positive predictions subject to a minimum sensitivity, the maximization of sensitivity subject to a maximum rate of positive predictions, and the Neyman-Pearson paradigm, which minimizes the type II error subject to an upper bound on the type I error. We propose an ensemble approach to these binary classification problems based on the Super Learner methodology. This approach linearly combines a user-supplied library of scoring algorithms, with combination weights and a discriminating threshold chosen to minimize the constrained optimality criterion. We then illustrate the application of the proposed classifier to develop an individualized PrEP targeting strategy in a resource-limited setting, with the goal of minimizing the number of PrEP offerings while achieving a minimum required sensitivity. This proof-of-concept data analysis uses baseline data from the ongoing Sustainable East Africa Research in Community Health study. Copyright © 2017 John Wiley & Sons, Ltd.
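As a rough illustration of the first problem named in the abstract (minimizing the rate of positive predictions subject to a minimum sensitivity), the Python sketch below picks a threshold for a single score and then grid-searches a convex weight over two scoring algorithms. It is only a sketch under stated assumptions, not the authors' Super Learner procedure: the function names, the two-algorithm restriction, the weight grid, and the in-sample (non-cross-validated) evaluation are all hypothetical simplifications introduced here.

```python
import numpy as np


def min_positive_rate_threshold(scores, labels, min_sensitivity):
    """Choose a threshold for the rule `score >= threshold`.

    Empirical sensitivity and positive-prediction rate are both
    non-increasing in the threshold, so the largest threshold that still
    meets the sensitivity constraint also minimizes the positive rate.
    Returns (threshold, positive_rate, sensitivity).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    if not labels.any():
        raise ValueError("need at least one positive label")
    # Scores of the true positives, highest first.
    pos_scores = np.sort(scores[labels])[::-1]
    # Number of positives that must be flagged to reach the target
    # (at least one, an assumption made to keep the threshold finite).
    k = max(int(np.ceil(min_sensitivity * pos_scores.size)), 1)
    threshold = pos_scores[k - 1]
    predicted = scores >= threshold
    return threshold, predicted.mean(), predicted[labels].mean()


def fit_constrained_ensemble(score_matrix, labels, min_sensitivity, grid_size=11):
    """Grid-search a convex weight over two score columns (hypothetical helper).

    For each candidate weight, the combined score is thresholded with
    min_positive_rate_threshold; the weight with the lowest in-sample
    positive rate is kept.
    """
    score_matrix = np.asarray(score_matrix, dtype=float)  # shape (n, 2)
    best = None
    for w in np.linspace(0.0, 1.0, grid_size):
        weights = np.array([w, 1.0 - w])
        combined = score_matrix @ weights
        thr, rate, sens = min_positive_rate_threshold(
            combined, labels, min_sensitivity
        )
        if best is None or rate < best["positive_rate"]:
            best = {
                "weights": weights,
                "threshold": thr,
                "positive_rate": rate,
                "sensitivity": sens,
            }
    return best
```

In practice the weights and threshold would be evaluated on held-out folds rather than on the data used to fit the scoring algorithms; that step is omitted here for brevity.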
