Mathematics Department, Mathematical Oncology Laboratory (MôLAB), Universidad de Castilla-La Mancha, Ciudad Real, Spain.
Helmholtz-Zentrum Berlin, Berlin, Germany.
Sci Rep. 2019 Apr 12;9(1):5982. doi: 10.1038/s41598-019-42326-3.
Many studies have built machine-learning (ML) prognostic models for glioblastoma (GBM) from radiological features. We wished to compare the predictive performance of these methods with that of human knowledge-based approaches. A total of 404 GBM patients were included (311 in the discovery cohort and 93 in the validation cohort). Sixteen morphological and 28 textural descriptors were obtained from pretreatment volumetric postcontrast T1-weighted magnetic resonance images. Different prognostic ML methods were developed. An optimized linear prognostic model (OLPM) was also built using the four significant, non-correlated parameters with individual prognostic value. The OLPM achieved high prognostic value (validation c-index = 0.817) and outperformed ML models based either on the same parameter set or on the full set of 44 attributes considered. Neural networks with cross-validation-optimized attribute selection achieved comparable results (validation c-index = 0.825). ML models using only the four selected parameters obtained better results than their counterparts based on all the attributes, which showed overfitting. In conclusion, the OLPM and the ML methods studied here provided the most accurate survival predictors for glioblastoma to date, owing to a combination of the strength of the methodology, the quality and volume of the data used, and the careful attribute selection. The ML methods studied suffered from overfitting and lost prognostic value when the number of parameters was increased.
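The abstract reports performance as a concordance index (c-index). As a rough illustration of what that number measures, the sketch below scores a few patients with a linear combination of four parameters and computes Harrell's c-index over the comparable patient pairs. The feature values, weights, survival times, and function names are all hypothetical and are not the OLPM coefficients or data from the study. Pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def linear_score(features, weights):
    """Risk score as a weighted sum of a patient's parameters."""
    return sum(w * x for w, x in zip(weights, features))


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: share of comparable pairs ranked correctly.

    A pair is comparable when the patient with the shorter time had an
    observed event (was not censored). Tied times are skipped here.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk died earlier: correct order
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, four parameters each.
weights = (0.5, -0.2, 0.3, 0.1)  # assumed weights, not from the paper
features = [
    (4.0, 1.0, 2.0, 1.0),
    (3.0, 2.0, 1.0, 2.0),
    (2.0, 1.0, 2.0, 0.0),
    (3.0, 1.0, 1.0, 1.0),
    (1.0, 3.0, 1.0, 1.0),
]
times = [5.0, 8.0, 12.0, 20.0, 30.0]         # survival in months
events = [True, True, False, True, False]    # False = censored

scores = [linear_score(f, weights) for f in features]
c_index = concordance_index(times, events, scores)
print(round(c_index, 3))  # 0.875 (7 of 8 comparable pairs are concordant)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the validation values of 0.817 and 0.825 quoted above.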