Branscum Adam J, Johnson Wesley O, Hanson Timothy E, Baron Andre T
Biostatistics Program, Oregon State University, Corvallis, 97331, Oregon, U.S.A.
Department of Statistics, University of California, Irvine, CA, U.S.A.
Stat Med. 2015 Dec 30;34(30):3997-4015. doi: 10.1002/sim.6610. Epub 2015 Aug 3.
A novel semiparametric regression model is developed for evaluating the covariate-specific accuracy of a continuous medical test or biomarker. Ideally, studies designed to estimate or compare medical test accuracy will use a separate, flawless gold-standard procedure to determine the true disease status of sampled individuals. We treat this as a special case of the more complicated and increasingly common scenario in which disease status is unknown because a gold-standard procedure does not exist or is too costly or invasive for widespread use. To compensate for missing data on disease status, covariate information is used to discriminate between diseased and healthy units. We thus model the probability of disease as a function of 'disease covariates'. In addition, we model test/biomarker outcome data to depend on 'test covariates', which gives researchers the opportunity to quantify the impact of covariates on the accuracy of a medical test. We further model the distributions of test outcomes using flexible semiparametric classes. An important new theoretical result demonstrating model identifiability under mild conditions is presented. The modeling framework can be used to obtain inferences about covariate-specific test accuracy and the probability of disease based on subject-specific disease and test covariate information. The value of the model is illustrated using multiple simulation studies and data on the age-adjusted ability of soluble epidermal growth factor receptor (a ubiquitous serum protein) to serve as a biomarker of lung cancer in men. SAS code for fitting the model is provided. Copyright © 2015 John Wiley & Sons, Ltd.
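The two-part structure described in the abstract (a disease model driven by disease covariates, and test-outcome distributions driven by test covariates) can be illustrated with a deliberately simplified sketch. This is not the authors' semiparametric model or their SAS code: it assumes a logistic disease model, normal test-outcome distributions with linear means, fixed rather than estimated parameters, and that larger test values indicate disease. All function and parameter names (`beta`, `gamma_d`, `gamma_h`, `sd_d`, `sd_h`) are hypothetical.

```python
import math
from statistics import NormalDist


def linear_predictor(covariates, coefs):
    """Intercept coefs[0] plus the dot product of coefs[1:] with the covariates."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], covariates))


def disease_probability(x, beta):
    """P(disease | disease covariates x) under an assumed logistic model."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(x, beta)))


def covariate_specific_auc(z, gamma_d, gamma_h, sd_d, sd_h):
    """AUC at test covariates z under an assumed binormal model.

    Diseased outcomes ~ N(mean_d(z), sd_d^2), healthy ~ N(mean_h(z), sd_h^2),
    so AUC = P(Y_d > Y_h) = Phi((mean_d - mean_h) / sqrt(sd_d^2 + sd_h^2)).
    """
    delta = linear_predictor(z, gamma_d) - linear_predictor(z, gamma_h)
    return NormalDist().cdf(delta / math.hypot(sd_d, sd_h))


def posterior_disease_probability(y, x, z, beta, gamma_d, gamma_h, sd_d, sd_h):
    """P(disease | test outcome y, covariates x and z) via Bayes' rule.

    The prior is the covariate-based disease probability; the likelihoods are
    the assumed normal densities of the test outcome in each disease group.
    """
    prior = disease_probability(x, beta)
    f_d = NormalDist(linear_predictor(z, gamma_d), sd_d).pdf(y)
    f_h = NormalDist(linear_predictor(z, gamma_h), sd_h).pdf(y)
    return prior * f_d / (prior * f_d + (1.0 - prior) * f_h)


if __name__ == "__main__":
    # Made-up parameter values, for illustration only.
    beta = [-2.0, 0.05]      # disease model: intercept, covariate slope
    gamma_d = [1.0, 0.02]    # diseased-group test mean: intercept, slope
    gamma_h = [0.0, 0.01]    # healthy-group test mean: intercept, slope
    x = z = [60.0]           # one covariate used in both roles
    print(covariate_specific_auc(z, gamma_d, gamma_h, 1.0, 1.0))
    print(posterior_disease_probability(2.5, x, z, beta, gamma_d, gamma_h, 1.0, 1.0))
```

In the setting the abstract describes, disease status is unobserved, so the quantities fixed here would be unknown and estimated from a mixture of the two outcome distributions. The sketch only shows how the covariate-specific accuracy and the subject-specific disease probability follow once such a model is specified.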