Division of Surgery and Interventional Science, University College London, Charles Bell House, 43-45 Foley Street, London W1W 7TY, UK; Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, Charles Bell House, 43-45 Foley Street, London W1W 7TY, UK.
Department of Computer Science, Jerusalem College of Technology, Havaad Haleumi 21, Givat Mordechai 91160 Jerusalem, Israel.
Clin Res Hepatol Gastroenterol. 2023 Mar;47(3):102087. doi: 10.1016/j.clinre.2023.102087. Epub 2023 Jan 18.
Oesophageal cancer is associated with poor health outcomes. Upper GI (UGI) endoscopy is the gold standard for diagnosis but causes patient discomfort and has a low yield for cancer. We used a machine learning approach to create a model that predicts oesophageal cancer from questionnaire responses.
We used data from two separate prospective cross-sectional studies: the Saliva to Predict rIsk of disease using Transcriptomics and epigenetics (SPIT) study and the predicting RIsk of diSease using detailed Questionnaires (RISQ) study. We recruited patients from National Health Service (NHS) suspected cancer pathways as well as patients with known cancer. We identified the patient characteristics and questionnaire responses most strongly associated with oesophageal cancer. Using the SPIT dataset, we trained seven different machine learning models and selected the one with the highest area under the receiver operating characteristic curve (AUC) as our final model. We then applied a cost function to maximise cancer detection, and independently validated the model using the RISQ dataset.
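The abstract does not say what form the cost function took. As a minimal sketch of one common reading, the Python below picks a decision cut-off by minimising a weighted misclassification cost in which a missed cancer (false negative) is penalised more heavily than an unnecessary referral (false positive). The function names, the 10:1 weighting, and the toy scores and labels are all assumptions made for illustration; none of them come from the study.

```python
# Toy illustration only: these scores and labels are invented, not study data.
TOY_SCORES = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]  # hypothetical model outputs
TOY_LABELS = [0, 0, 1, 0, 0, 1, 1, 1]                  # 1 = cancer, 0 = no cancer


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def pick_threshold(scores, labels, fn_cost=10.0, fp_cost=1.0):
    """Return (threshold, cost) with the lowest weighted misclassification cost.

    fn_cost and fp_cost are assumed weights; each observed score is tried
    as a candidate cut-off, and ties keep the lowest candidate.
    """
    best_threshold, best_cost = None, float("inf")
    for candidate in sorted(set(scores)):
        _, fp, _, fn = confusion_counts(scores, labels, candidate)
        cost = fn_cost * fn + fp_cost * fp
        if cost < best_cost:
            best_threshold, best_cost = candidate, cost
    return best_threshold, best_cost


if __name__ == "__main__":
    print(pick_threshold(TOY_SCORES, TOY_LABELS))            # false negatives weighted 10:1
    print(pick_threshold(TOY_SCORES, TOY_LABELS, 1.0, 1.0))  # equal weights, for comparison
```

Weighting false negatives more heavily moves the chosen cut-off lower, which trades specificity for sensitivity. That is the direction consistent with the stated aim of maximising cancer detection.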
A total of 807 patients were included in model training and testing, split in a 70:30 ratio, and 294 patients were included in model validation. The best model during training was regularised logistic regression using 17 features (median AUC: 0.81, interquartile range (IQR): 0.69-0.85). On the testing and validation datasets, the model achieved an AUC of 0.71 (95% CI: 0.61-0.81) and 0.92 (95% CI: 0.88-0.96), respectively. At a set cut-off, our model achieved a sensitivity of 97.6% and a specificity of 59.1%. We additionally piloted the model in 12 patients with gastric cancer, of whom 9 (75%) were correctly classified.
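For readers less familiar with these metrics, the following Python sketch shows how sensitivity, specificity, and AUC are conventionally computed. It is not the study's code: the function names and the toy scores are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) definition. The only figures taken from the abstract are the 9 of 12 gastric cancer patients, treated here on the assumption that "correctly classified" means flagged as positive.

```python
def sensitivity(tp, fn):
    """Fraction of true cancer cases that the model flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cancer cases that the model correctly leaves unflagged."""
    return tn / (tn + fp)


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Gastric cancer pilot from the abstract: 9 of 12 cases flagged.
    print(sensitivity(tp=9, fn=3))

    # Invented scores and labels, used only to exercise the AUC function.
    toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
    print(auc(toy_scores, toy_labels))
```

Sensitivity and specificity depend on the chosen cut-off, whereas AUC summarises ranking performance across all cut-offs, which is why the abstract reports them separately.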
We have developed and validated a questionnaire-based risk stratification tool. It could help prioritise patients at high risk of oesophageal cancer for endoscopy and could help address endoscopy backlogs caused by the COVID-19 pandemic.