Leng Tuo, Zhao Qingyu, Yang Chao, Lu Zhufu, Adeli Ehsan, Pohl Kilian M
School of Computer Engineering and Sciences, Shanghai University, Shanghai, China.
School of Medicine, Stanford University, Stanford, CA, USA.
Large Scale Annot Biomed Data Expert Label Synth Hardw Aware Learn Med Imaging Comput Assist Interv (2019). 2019 Oct;11851:32-41. doi: 10.1007/978-3-030-33642-4_4. Epub 2019 Oct 24.
Due to difficulties in collecting sufficient training data, recent advances in neural-network-based methods have not been fully explored in the analysis of brain Magnetic Resonance Imaging (MRI). A possible solution to the limited-data issue is to augment the training set with synthetically generated data. In this paper, we propose a data augmentation strategy based on . We demonstrate the advantages of this strategy by training a simple neural-network-based classifier to predict when individual youths transition from no-to-low to medium-to-heavy alcohol drinking, based solely on their volumetric MRI measurements. Using 20-fold cross-validation, we generate more than one million synthetic samples from fewer than 500 subjects for each training run. The classifier achieves an accuracy of 74.1% in distinguishing non-drinkers from drinkers at baseline and a weighted accuracy of 43.2% in predicting the transition over a three-year period (a 5-group classification task). Both accuracy scores are significantly better than those obtained by training the classifier on the original dataset.
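The abstract does not define "weighted accuracy". One common reading, assumed here and not confirmed by the source, is the mean of the per-class recalls, so that each of the five groups counts equally regardless of its size. Under that assumption, a minimal sketch with made-up labels (the function name and data are hypothetical) shows how the metric can differ from plain accuracy on an imbalanced 5-group task:

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (assumed definition, not from the source)."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Average recall over the classes that occur in y_true.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical, heavily imbalanced labels for groups 0..4.
y_true = [0] * 6 + [1, 2, 3, 4]
y_pred = [0] * 6 + [0, 0, 0, 4]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = weighted_accuracy(y_true, y_pred)
print(f"plain accuracy:    {plain:.2f}")
print(f"weighted accuracy: {balanced:.2f}")
```

With five equally weighted groups, a classifier that guesses at random would score about 20% under this definition, which may be a useful baseline when reading a 5-group score.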