Abdallah Mohamed S
Department of Quantitative Techniques, Faculty of Commerce, Aswan University, Egypt.
Stat Methods Med Res. 2023 Jun;32(6):1217-1233. doi: 10.1177/09622802231167434. Epub 2023 Apr 10.
Receiver operating characteristic (ROC) analysis is a useful technique for evaluating the performance of a binary classifier. The area under the ROC curve (AUC) is an effective index of classification accuracy. While nonparametric point estimation has been well studied under ranked set sampling (RSS), it has received little attention under variations of RSS. To fill this gap, this article addresses the problem of estimating the AUC based on paired ranked set sampling (PRSS). New AUC estimators based on PRSS are proposed. Using the information provided by a concomitant variable, additional AUC estimators based on RSS as well as PRSS are also introduced. It is shown, either theoretically or numerically, that the proposed estimators are consistent under perfect ranking. The concomitant-based estimators are superior to their competitors provided that the perfect ranking assumption is not sharply violated. In contrast, kernel-based estimators are significantly superior to their rivals regardless of the quality of ranking. Finally, the proposed procedures are applied to empirical datasets from medicine.
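For readers unfamiliar with the setting, the following is a minimal sketch of the standard nonparametric (Mann-Whitney) AUC estimate computed from two samples drawn by ordinary balanced ranked set sampling with perfect ranking. It is not the article's paired-RSS, concomitant-based, or kernel-based estimators, whose exact forms are not given in this abstract. The function names, the normal distributions, the set size, and the number of cycles are all illustrative assumptions.

```python
import random


def auc_mann_whitney(x, y):
    """Fraction of (x, y) pairs with x < y, counting ties as one half.

    x: measurements from the non-diseased group
    y: measurements from the diseased group
    """
    total = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                total += 1.0
            elif xi == yj:
                total += 0.5
    return total / (len(x) * len(y))


def ranked_set_sample(draw, set_size, cycles, rng):
    """Balanced RSS with perfect ranking (assumed for illustration).

    In each cycle and for each rank r, draw `set_size` units, rank them by
    their true values, and measure only the r-th smallest.
    """
    sample = []
    for _ in range(cycles):
        for r in range(set_size):
            units = sorted(draw(rng) for _ in range(set_size))
            sample.append(units[r])
    return sample


# Hypothetical populations: N(0, 1) for non-diseased, N(1, 1) for diseased.
rng = random.Random(0)
healthy = ranked_set_sample(lambda g: g.gauss(0.0, 1.0), set_size=3, cycles=10, rng=rng)
diseased = ranked_set_sample(lambda g: g.gauss(1.0, 1.0), set_size=3, cycles=10, rng=rng)
auc_hat = auc_mann_whitney(healthy, diseased)
print(f"Estimated AUC: {auc_hat:.3f}")
```

Ranking on the true values corresponds to the perfect ranking case mentioned in the abstract; imperfect ranking could be mimicked by sorting on a noisy or concomitant variable instead.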