Huang Ying, Fong Youyi, Wei John, Feng Ziding
Fred Hutchinson Cancer Research Center, Vaccine & Infectious Disease / Public Health Sciences, 1100 Fairview Avenue N., Seattle, WA 98109, USA.
J R Stat Soc Ser C Appl Stat. 2011 Nov 1;60(5):633-653. doi: 10.1111/j.1467-9876.2011.00761.x.
A marker's capacity to predict risk of a disease depends on the disease prevalence in the target population and on the marker's classification accuracy, i.e. its ability to discriminate diseased subjects from non-diseased subjects. The latter is often considered an intrinsic property of the marker; it is independent of disease prevalence and hence more likely to be similar across populations than risk prediction measures. In this paper, we are interested in evaluating the population-specific performance of a risk prediction marker in terms of positive predictive value (PPV) and negative predictive value (NPV) at given thresholds, when samples are available from the target population as well as from another population. A default strategy is to estimate PPV and NPV using samples from the target population only. However, when the marker's classification accuracy, as characterized by a specific point on the receiver operating characteristic (ROC) curve, is similar across populations, borrowing information across populations allows increased efficiency in estimating PPV and NPV. We develop estimators that optimally combine information across populations. We apply this methodology to a cross-sectional study where we evaluate PCA3 as a risk prediction marker for prostate cancer among subjects with or without previous negative biopsy.
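The dependence of PPV and NPV on both prevalence and a point on the ROC curve follows from Bayes' theorem. The minimal sketch below illustrates that relationship only; it is not the combined estimator developed in the paper. The function name `predictive_values` and all numeric values (a true positive rate of 0.8, a false positive rate of 0.2, prevalences of 0.1 and 0.4) are hypothetical and chosen purely for illustration.

```python
def predictive_values(tpr, fpr, prevalence):
    """Return (PPV, NPV) for a marker dichotomized at a fixed threshold.

    tpr and fpr are the ROC point at that threshold (assumed known here);
    prevalence is the disease prevalence in the target population.
    """
    # Probability that a randomly chosen subject tests positive.
    p_pos = tpr * prevalence + fpr * (1.0 - prevalence)
    # Bayes' theorem: P(diseased | positive) and P(non-diseased | negative).
    ppv = tpr * prevalence / p_pos
    npv = (1.0 - fpr) * (1.0 - prevalence) / (1.0 - p_pos)
    return ppv, npv


# Hypothetical ROC point shared by two populations with different prevalence.
tpr, fpr = 0.8, 0.2
for prevalence in (0.1, 0.4):
    ppv, npv = predictive_values(tpr, fpr, prevalence)
    print(f"prevalence={prevalence:.1f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With the same ROC point, the two hypothetical populations yield different PPV and NPV. This is why predictive values are population-specific even when classification accuracy is shared, and why an ROC point estimated in one population can in principle inform PPV and NPV estimation in another.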