An improved procedure for estimation of malignant breast cancer prevalence using partially rank ordered set samples with multiple concomitants.

Author information

Hatefi Armin, Jafari Jozani Mohammad

Affiliations

1 Department of Statistical Sciences, University of Toronto and The Fields Institute for Research in Mathematical Sciences, Toronto, Canada.

2 Department of Statistics, University of Manitoba, Winnipeg, Canada.

Publication information

Stat Methods Med Res. 2017 Dec;26(6):2552-2566. doi: 10.1177/0962280215601458. Epub 2015 Aug 26.

Abstract

Rank-based sampling designs are widely used in situations where measuring the variable of interest is costly but a small number of sampling units (set) can be easily ranked prior to taking the final measurements on them and this can be done at little cost. When the variable of interest is binary, a common approach for ranking the sampling units is to estimate the probabilities of success through a logistic regression model. However, this requires training samples for model fitting. Also, in this approach once a sampling unit has been measured, the extra rank information obtained in the ranking process is not used further in the estimation process. To address these issues, in this paper, we propose to use the partially rank-ordered set sampling design with multiple concomitants. In this approach, instead of fitting a logistic regression model, a soft ranking technique is employed to obtain a vector of weights for each measured unit that represents the probability or the degree of belief associated with its rank among a small set of sampling units. We construct an estimator which combines the rank information and the observed partially rank-ordered set measurements themselves. The proposed methodology is applied to a breast cancer study to estimate the proportion of patients with malignant (cancerous) breast tumours in a given population. Through extensive numerical studies, the performance of the estimator is evaluated under various concomitants with different ranking potentials (i.e. good, intermediate and bad) and tie structures among the ranks. We show that the precision of the partially rank-ordered set estimator is better than its counterparts under simple random sampling and ranked set sampling designs and, hence, the sample size required to achieve a desired precision is reduced.
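The precision claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' partially rank-ordered set estimator with soft ranking weights; it only compares simple random sampling with plain balanced ranked set sampling for a binary outcome. The concomitant model (outcome plus Gaussian noise), the function names, and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def draw_unit(rng, p, noise_sd):
    """Draw one unit: binary outcome y and a hypothetical noisy concomitant x."""
    y = 1 if rng.random() < p else 0
    x = y + rng.gauss(0.0, noise_sd)  # assumed concomitant model, not from the paper
    return y, x


def srs_estimate(rng, p, n, noise_sd):
    """Sample proportion from n units measured under simple random sampling."""
    return sum(draw_unit(rng, p, noise_sd)[0] for _ in range(n)) / n


def rss_estimate(rng, p, set_size, cycles, noise_sd):
    """Balanced ranked set sampling: in each cycle, for every rank r, draw a
    fresh set, rank it by the concomitant, and measure only the r-th unit."""
    total = 0
    for _ in range(cycles):
        for r in range(set_size):
            units = sorted(
                (draw_unit(rng, p, noise_sd) for _ in range(set_size)),
                key=lambda u: u[1],
            )
            total += units[r][0]
    return total / (set_size * cycles)


def compare(p=0.3, set_size=3, cycles=10, reps=2000, noise_sd=0.5, seed=1):
    """Monte Carlo mean and variance of both estimators at equal measured sample size."""
    rng = random.Random(seed)
    n = set_size * cycles
    srs = [srs_estimate(rng, p, n, noise_sd) for _ in range(reps)]
    rss = [rss_estimate(rng, p, set_size, cycles, noise_sd) for _ in range(reps)]
    return {
        "srs_mean": statistics.fmean(srs),
        "srs_var": statistics.pvariance(srs),
        "rss_mean": statistics.fmean(rss),
        "rss_var": statistics.pvariance(rss),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.5f}")
```

Both estimators measure the same number of units (set_size × cycles), so any difference in variance comes from the ranking step alone. Increasing noise_sd weakens the concomitant and should shrink the gain over simple random sampling.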
