Liu Yi-Shao, Thaliffdeen Ryan, Han Sola, Park Chanhyun
College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA.
Expert Rev Pharmacoecon Outcomes Res. 2023 Jul-Dec;23(7):761-771. doi: 10.1080/14737167.2023.2224963. Epub 2023 Jun 19.
The objective of this systematic review is to summarize the use of machine learning (ML) in predicting overall survival (OS) in patients with bladder cancer.
Search terms for bladder cancer, ML algorithms, and mortality were used to identify studies in PubMed and Web of Science as of February 2022. Studies that utilized patient-level datasets were included, and studies based primarily on gene expression-related datasets were excluded. Study quality and bias were assessed using the International Journal of Medical Informatics (IJMEDI) checklist.
Of the 14 included studies, the most common algorithms were artificial neural networks (n = 8) and logistic regression (n = 4). Nine articles described missing data handling, with five articles removing patients with missing data entirely. With respect to feature selection, the most common sociodemographic variables were age (n = 9), gender (n = 9), and smoking status (n = 3), with clinical variables most commonly including tumor stage (n = 8), grade (n = 7), and lymph node involvement (n = 6). Most studies (n = 10) were of medium IJMEDI quality, with common areas of improvement being the descriptions of data preparation and deployment.
ML holds promise for optimizing bladder cancer care through accurate OS predictions, but challenges related to data processing, feature selection, and data source quality must be resolved to develop robust models. While this review is limited by its inability to compare models across studies, it will inform decision-making by various stakeholders, improve understanding of ML-based OS prediction in bladder cancer, and foster interpretability of future models.