Centre for Cancer Prevention, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London, UK.
School of Physics, Astronomy and Mathematics, University of Hertfordshire, Hatfield, UK.
Br J Cancer. 2020 Mar;122(5):692-696. doi: 10.1038/s41416-019-0694-0. Epub 2019 Dec 20.
An accurate and simple risk prediction model that would facilitate earlier detection of pancreatic adenocarcinoma (PDAC) is not available at present. In this study, we compare different algorithms of risk prediction in order to select the best one for constructing a biomarker-based risk score, PancRISK.
Three hundred and seventy-nine patients with available measurements of three urine biomarkers (LYVE1, REG1B and TFF1) from retrospectively collected samples, together with creatinine and age, were stratified into cases (PDAC) and controls (healthy patients) and then randomly split into training and validation sets. Several machine learning algorithms were applied, and their performance was compared using AUC (area under the ROC curve) and sensitivity at clinically relevant specificity.
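The evaluation steps described here (a split stratified by case/control status, AUC, and sensitivity at a fixed specificity) can be illustrated with a minimal sketch. It is not the study's code: the function names, the 50/50 split fraction, the 0.9 specificity target and the toy scores are all assumptions made for illustration.

```python
import random

def stratified_split(labels, train_fraction=0.5, seed=0):
    """Split indices into training and validation sets, class by class."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        valid.extend(idx[cut:])
    return sorted(train), sorted(valid)

def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Ties between a case and a control count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_at_specificity(scores, labels, target_specificity=0.9):
    """Best sensitivity over thresholds whose specificity meets the target."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)) + [float("inf")]:
        # A sample is called a case when its score is >= t.
        specificity = sum(n < t for n in neg) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(p >= t for p in pos) / len(pos)
            best = max(best, sensitivity)
    return best

# Hypothetical toy data: 1 = case (PDAC), 0 = control.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.8]

train_idx, valid_idx = stratified_split(toy_labels)
print("AUC:", auc(toy_scores, toy_labels))
print("Sensitivity:", sensitivity_at_specificity(toy_scores, toy_labels, 0.9))
```

Splitting within each class keeps the case-to-control ratio the same in both sets, so the metrics computed on the validation set are not distorted by an unbalanced draw.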
None of the algorithms significantly outperformed all the others. A logistic regression model, the easiest to interpret, was incorporated into the PancRISK score and subsequently evaluated on the whole data set. PancRISK performance could be improved further by adding CA19-9, a commonly used PDAC biomarker, to the model.
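For orientation, a logistic regression risk score has the generic form below. This is the standard textbook expression, not the fitted PancRISK model: the coefficients \beta_0, \dots, \beta_5 are placeholders, and the abstract does not state whether the five predictors x_1, \dots, x_5 (LYVE1, REG1B, TFF1, creatinine and age) are transformed before entering the model.

```latex
P(\mathrm{PDAC} \mid x) = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{i=1}^{5} \beta_i x_i\right)\right)}
```

Each \beta_i is the change in the log-odds of PDAC per unit increase in x_i, which is what makes this kind of model easy to interpret.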
The PancRISK score enables easy interpretation of the biomarker panel data and is currently being tested to confirm whether it can be used to stratify patients at risk of developing pancreatic cancer completely non-invasively, using urine samples.