IDEAS Research Institute, Robert Gordon University, St. Andrew Street, Aberdeen AB25 1HG, UK.
Artif Intell Med. 2012 May;55(1):25-35. doi: 10.1016/j.artmed.2011.11.003. Epub 2011 Dec 27.
Prediction of prostate cancer pathological stage is an essential step in a patient's pathway, because it determines the treatment that will subsequently be applied. In current practice, urologists use the pathological stage predictions provided in Partin tables to support their decisions. However, Partin tables are based on logistic regression (LR) and built from US data. Our objective is to investigate a range of predictive methods and predictive variables for pathological stage prediction and to assess their predictive quality on UK data.
The latest version of Partin tables was applied to a large-scale British dataset in order to measure their performance by means of the concordance index (c-index). The data were collected by the British Association of Urological Surgeons (BAUS) and comprise records of over 1700 patients treated with prostatectomy in 57 centers across the UK. The original methodology was replicated using the BAUS dataset and evaluated using the c-index. In addition, a selection of classifiers, including, among others, LR, artificial neural networks and Bayesian networks (BNs), was applied to the same data and compared using the area under the ROC curve (AUC). Subsets of the data were created in order to observe how classifiers perform when extra variables are included. Finally, a local dataset prepared by the Aberdeen Royal Infirmary was used to study the effect of using different variables on predictive performance.
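For a two-class outcome, the c-index coincides with the AUC: both give the proportion of (positive, negative) patient pairs in which the positive case receives the higher predicted score. The following is a minimal sketch of that pairwise calculation, not the evaluation code used in the study. The function name `concordance_index` and the toy scores and labels are hypothetical and serve only as illustration.

```python
def concordance_index(scores, labels):
    """Pairwise c-index for a binary outcome (equal to the AUC).

    scores: predicted probabilities or risk scores, one per patient
    labels: 1 for the class of interest, 0 otherwise
    Tied scores count as half-concordant.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive case ranked above negative case
            elif p == n:
                concordant += 0.5   # tie: half credit
    return concordant / (len(pos) * len(neg))


# Made-up example values, not taken from the BAUS dataset.
scores = [0.9, 0.8, 0.4, 0.35, 0.2]
labels = [1, 0, 1, 0, 0]
print(concordance_index(scores, labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect discrimination, which is the scale on which the c-index and AUC figures reported here should be read.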
Partin tables have low predictive quality (c-index=0.602) when applied to UK data to discriminate between patients with organ-confined (OC) disease and those with extraprostatic extension, the two most frequently observed pathological stages. Replicate lookup tables built from British data improve the classification, but the overall predictive quality remains low (c-index=0.610). Comparing a range of classifiers shows that BNs generally outperform other methods. Using the four variables from Partin tables, naive Bayes is the best classifier for the prediction of each class label (AUC=0.662 for OC). When two additional variables are added, the results of LR (0.675), artificial neural networks (0.656) and BN methods (0.679) improve overall. BNs show higher AUCs than the other methods as the number of variables rises.
The predictive quality of Partin tables can be described as low to moderate on UK data. This means that, following the predictions generated by Partin tables, many patients would receive an inappropriate treatment, generally associated with a deterioration in their quality of life. In addition to demographic differences between the UK and the original US population, the methodology, and LR in particular, has limitations. BNs represent a promising alternative to LR from which prostate cancer staging can benefit. Heuristic search for structure learning and the inclusion of more variables further improve the quality of BN models.