Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK.
Great Ormond Street Institute of Child Health, University College London, London, UK.
Magn Reson Med. 2022 Feb;87(2):932-947. doi: 10.1002/mrm.29014. Epub 2021 Sep 21.
Supervised machine learning (ML) provides a compelling alternative to traditional model fitting for parameter mapping in quantitative MRI. The aim of this work is to demonstrate and quantify the effect of different training data distributions on the accuracy and precision of parameter estimates when supervised ML is used for fitting.
We fit two- and three-compartment biophysical models to diffusion measurements from the in vivo human brain, as well as to simulated diffusion data, using both traditional model fitting and supervised ML. For supervised ML, we train several artificial neural networks, as well as random forest regressors, on different distributions of ground truth parameters. We compare the accuracy and precision of parameter estimates obtained from the different estimation approaches using synthetic test data.
When the distribution of parameter combinations in the training set matches that observed in healthy human data sets, we observe high precision but inaccurate estimates for atypical parameter combinations. In contrast, when training data are sampled uniformly from the entire plausible parameter space, estimates tend to be more accurate for atypical parameter combinations but may have lower precision for typical parameter combinations.
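The effect described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the signal model is a toy bi-exponential with fixed diffusivities and a single unknown signal fraction `f`, the b-values and noise level are made up, and a simple Nadaraya-Watson kernel regressor stands in for the neural networks and random forests used in the study. All function and variable names (`simulate`, `kernel_regress`, `run_demo`) are hypothetical.

```python
import numpy as np

# Assumed toy model (not the paper's): a two-compartment bi-exponential decay
# with fixed diffusivities, so the signal fraction f is the only unknown.
B_VALUES = np.array([0.0, 0.5, 1.0, 2.0, 3.0])   # ms/um^2, assumed protocol
D_SLOW, D_FAST = 0.5, 2.0                          # um^2/ms, assumed values
SIGMA = 0.02                                       # assumed Gaussian noise level


def simulate(f, rng):
    """Noisy signals for a 1-D array of signal fractions f, shape (n, n_b)."""
    f = np.asarray(f, dtype=float)[:, None]
    clean = f * np.exp(-B_VALUES * D_SLOW) + (1.0 - f) * np.exp(-B_VALUES * D_FAST)
    return clean + rng.normal(0.0, SIGMA, size=clean.shape)


def kernel_regress(train_x, train_f, test_x, bandwidth=SIGMA):
    """Nadaraya-Watson estimate: a kernel-weighted mean of the training labels."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    d2 -= d2.min(axis=1, keepdims=True)            # guard against underflow
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (w * train_f).sum(axis=1) / w.sum(axis=1)


def run_demo(seed=0, n_train=2000, n_test=200):
    """Return {(training prior, test case): (bias, standard deviation)}."""
    rng = np.random.default_rng(seed)
    priors = {
        # Narrow prior mimicking "typical" values (an assumption).
        "healthy-like": np.clip(rng.normal(0.5, 0.05, n_train), 0.0, 1.0),
        # Uniform prior over the whole plausible range.
        "uniform": rng.uniform(0.0, 1.0, n_train),
    }
    results = {}
    for name, f_train in priors.items():
        x_train = simulate(f_train, rng)
        for label, f_true in (("typical", 0.5), ("atypical", 0.1)):
            x_test = simulate(np.full(n_test, f_true), rng)
            est = kernel_regress(x_train, f_train, x_test)
            results[(name, label)] = (est.mean() - f_true, est.std())
    return results


if __name__ == "__main__":
    for key, (bias, sd) in run_demo().items():
        print(key, f"bias={bias:+.3f}", f"sd={sd:.3f}")
```

In this toy setting, the estimator trained on the narrow prior can only return values within the range of its training labels, so the atypical test case (`f = 0.1`) is pulled towards the typical range, while the uniformly trained estimator recovers it with much smaller bias.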
This work highlights that estimation of model parameters using supervised ML depends strongly on the training-set distribution. We show that high precision obtained using ML may mask strong bias, and that visual assessment of the parameter maps is not sufficient for evaluating the quality of the estimates.
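One standard way to see why high precision can coexist with poor accuracy is the decomposition of the mean squared error into bias and variance. The notation here is an assumption for illustration, not taken from the paper: $\hat{\theta}$ denotes a parameter estimate and $\theta$ its ground-truth value.

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{\theta}] - \theta\big)^2}_{\text{bias}^2}
  + \underbrace{\mathrm{Var}(\hat{\theta})}_{\text{variance}}
```

A smooth, low-noise parameter map reflects only the variance term; the bias term cannot be seen without reference to ground truth, for example from synthetic test data.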