
Evaluating Binary Outcome Classifiers Estimated from Survey Data.

Author information

From the Department of Statistical Science, Duke University, Durham, NC.

Publication information

Epidemiology. 2024 Nov 1;35(6):805-812. doi: 10.1097/EDE.0000000000001776. Epub 2024 Aug 14.

Abstract

Surveys are commonly used to facilitate research in epidemiology, health, and the social and behavioral sciences. Often, these surveys are not simple random samples, and respondents are given weights reflecting their probability of selection into the survey. We show that using survey weights can be beneficial for evaluating the quality of predictive models when splitting data into training and test sets. In particular, we characterize model assessment statistics, such as sensitivity and specificity, as finite population quantities and compute survey-weighted estimates of these quantities with test data comprising a random subset of the original data. Using simulations with data from the National Survey on Drug Use and Health and the National Comorbidity Survey, we show that unweighted metrics estimated with sample test data can misrepresent population performance, but weighted metrics appropriately adjust for the complex sampling design. We also show that this conclusion holds for models trained using upsampling for mitigating class imbalance. The results suggest that weighted metrics should be used when evaluating performance on test data derived from complex surveys.
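The idea of treating sensitivity and specificity as finite population quantities can be illustrated with a minimal sketch. It assumes each test-set respondent carries a weight equal to the inverse of their selection probability, and that each metric is estimated as a ratio of weighted totals; the abstract does not spell out the exact estimator, so the function name `weighted_sens_spec` and the toy data below are hypothetical and purely illustrative.

```python
import math
from typing import Sequence, Tuple


def weighted_sens_spec(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) as ratios of weighted totals.

    Assumption: each weight is the number of population units the
    respondent represents (inverse selection probability).
    """
    tp = fn = tn = fp = 0.0
    for y, p, w in zip(y_true, y_pred, weights):
        if y == 1 and p == 1:
            tp += w  # weighted true positives
        elif y == 1:
            fn += w  # weighted false negatives
        elif p == 0:
            tn += w  # weighted true negatives
        else:
            fp += w  # weighted false positives
    # Return NaN when a class is absent from the test set.
    sens = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    spec = tn / (tn + fp) if (tn + fp) > 0 else math.nan
    return sens, spec


# Hypothetical test set of five respondents with unequal weights.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1]
weights = [10.0, 30.0, 20.0, 20.0, 60.0]

weighted = weighted_sens_spec(y_true, y_pred, weights)
# Setting every weight to 1 recovers the usual unweighted metrics.
unweighted = weighted_sens_spec(y_true, y_pred, [1.0] * len(y_true))

print("weighted:  ", weighted)
print("unweighted:", unweighted)
```

In this toy example the misclassified respondents happen to carry large weights, so the weighted estimates fall below the unweighted ones. That is the kind of gap between sample and population performance the abstract describes.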

