Longato Enrico, Vettoretti Martina, Di Camillo Barbara
Department of Information Engineering, University of Padova, Padova, Italy.
J Biomed Inform. 2020 Aug;108:103496. doi: 10.1016/j.jbi.2020.103496. Epub 2020 Jul 9.
Developing a prognostic model for biomedical applications typically requires mapping an individual's set of covariates to a measure of the risk that they may experience the event to be predicted. Many scenarios, however, especially those involving adverse pathological outcomes, are better described by explicitly accounting for the timing of these events, as well as their probability. In these cases, traditional classification or ranking metrics may be inadequate to inform model evaluation or selection. To address this limitation, it is common practice to reframe the problem in the context of survival analysis and to resort, instead, to the concordance index (C-index), which summarises how well a predicted risk score describes an observed sequence of events. A practically meaningful interpretation of the C-index, however, may present several difficulties and pitfalls. Specifically, we identify two main issues: i) the C-index remains implicitly, and subtly, dependent on time, and ii) its relationship with the number of subjects whose risk was incorrectly predicted is not straightforward. Failure to consider these two aspects may introduce unwanted biases in the evaluation process, and even result in the selection of a suboptimal model. Here, we discuss ways to obtain a meaningful interpretation in spite of these difficulties. Aiming to assist experimenters regardless of their familiarity with the C-index, we start from an introductory-level presentation of its most popular estimator, highlighting the latter's temporal dependency, and suggesting how it might be correctly used to inform model selection. We also address the nonlinearity of the C-index with respect to the number of correct risk predictions, elaborating a simplified framework that may enable an easier interpretation and quantification of C-index improvements or deteriorations.
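As a rough illustration of the temporal dependency mentioned above, the following minimal sketch computes a pairwise concordance estimate in the style of Harrell's C-index. It assumes that this is the "most popular estimator" the abstract refers to, which the abstract itself does not name. The function name `harrell_c_index`, the optional truncation time `tau`, and the toy data are hypothetical and chosen only for illustration; tied event times are simply ignored, which a production implementation would need to handle.

```python
def harrell_c_index(times, events, risks, tau=None):
    """Pairwise concordance between predicted risks and observed times.

    times  : observed follow-up times (event or censoring)
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk scores (higher means earlier expected event)
    tau    : optional truncation time (an assumption of this sketch);
             only events occurring at or before tau anchor a pair
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is anchored by a subject whose event was actually observed.
        if not events[i]:
            continue
        if tau is not None and times[i] > tau:
            continue
        for j in range(n):
            # Subject j must have been followed for longer than subject i.
            if times[j] > times[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.5, 0.1]

c_full = harrell_c_index(times, events, risks)           # 4 of 5 pairs: 0.8
c_early = harrell_c_index(times, events, risks, tau=3)   # 3 of 3 pairs: 1.0
print(c_full, c_early)
```

On this toy data the same risk scores yield 0.8 over the whole follow-up but 1.0 when only events up to time 3 anchor the comparison, which is one simple way to see that the value depends on the time window over which pairs are formed.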