Monte Carlo studies of bootstrap variability in ROC analysis with data dependency.

Authors

Wu Jin Chu, Martin Alvin F, Kacker Raghu N

Affiliation

National Institute of Standards and Technology, Gaithersburg, Maryland 20899, USA.

Publication

Commun Stat Simul Comput. 2018;48(2). doi: 10.1080/03610918.2018.1521974.

Abstract

ROC analysis involving two large datasets is an important method for analyzing statistics of interest for decision making of a classifier in many disciplines. Because resources are limited, the same subjects are often used multiple times to generate more samples, so data dependency is ubiquitous. To account for it, a two-layer data structure is constructed, and the nonparametric two-sample two-layer bootstrap is employed to estimate the standard errors of statistics of interest derived from two sets of data, such as a weighted sum of two probabilities. In this article, to reduce the bootstrap variance and ensure the accuracy of computation, Monte Carlo studies of bootstrap variability were carried out to determine the appropriate number of bootstrap replications in ROC analysis with data dependency. The results suggest that, with a tolerance of 0.02 on the coefficient of variation, 2,000 bootstrap replications are appropriate under such circumstances.
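
The procedure summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is one plausible reading of a two-sample two-layer bootstrap, in which subjects are resampled first and each selected subject's scores are resampled second. The function names, the toy Gaussian scores, the threshold, and the equal weights are all assumptions made for illustration. The Monte Carlo step repeats the bootstrap standard-error estimate several times and reports the coefficient of variation of those estimates, which could then be compared with a tolerance such as 0.02.

```python
import random
import statistics


def weighted_error(genuine, impostor, threshold, w_miss=0.5, w_fa=0.5):
    """Weighted sum of two probabilities: miss rate and false-alarm rate."""
    p_miss = sum(s < threshold for s in genuine) / len(genuine)
    p_fa = sum(s >= threshold for s in impostor) / len(impostor)
    return w_miss * p_miss + w_fa * p_fa


def resample_two_layer(groups, rng):
    """Layer 1: draw subjects with replacement.
    Layer 2: draw scores with replacement within each drawn subject."""
    flat = []
    for _ in range(len(groups)):
        scores = rng.choice(groups)
        flat.extend(rng.choices(scores, k=len(scores)))
    return flat


def bootstrap_se(gen_groups, imp_groups, threshold, n_boot, rng):
    """Standard error of the statistic, estimated from n_boot replications."""
    replications = [
        weighted_error(
            resample_two_layer(gen_groups, rng),
            resample_two_layer(imp_groups, rng),
            threshold,
        )
        for _ in range(n_boot)
    ]
    return statistics.stdev(replications)


def se_coefficient_of_variation(gen_groups, imp_groups, threshold,
                                n_boot, n_runs, seed=0):
    """Monte Carlo study: repeat the bootstrap n_runs times and return the
    coefficient of variation (stdev / mean) of the resulting SE estimates."""
    rng = random.Random(seed)
    ses = [
        bootstrap_se(gen_groups, imp_groups, threshold, n_boot, rng)
        for _ in range(n_runs)
    ]
    return statistics.stdev(ses) / statistics.mean(ses)


def make_toy_groups(center, n_subjects, n_scores, rng):
    """Hypothetical data: scores from one subject share a random offset,
    which makes them dependent on each other."""
    groups = []
    for _ in range(n_subjects):
        subject_mean = rng.gauss(center, 0.5)
        groups.append([rng.gauss(subject_mean, 1.0) for _ in range(n_scores)])
    return groups


if __name__ == "__main__":
    data_rng = random.Random(42)
    genuine = make_toy_groups(1.0, 20, 5, data_rng)
    impostor = make_toy_groups(-1.0, 20, 5, data_rng)
    for n_boot in (50, 200):
        cv = se_coefficient_of_variation(genuine, impostor, 0.0,
                                         n_boot=n_boot, n_runs=10)
        print(f"replications={n_boot}: CoV of SE estimates = {cv:.3f}")
```

In a study of this kind, the number of replications would be increased until the coefficient of variation falls below the chosen tolerance. The toy sizes here are far smaller than the large datasets the abstract refers to, so the printed values say nothing about the paper's 2,000-replication recommendation.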
