AppendiXNet: Deep Learning for Diagnosis of Appendicitis from A Small Dataset of CT Exams Using Video Pretraining.
Affiliations
Stanford University Department of Computer Science, Stanford, USA.
Stanford University Department of Radiology, Stanford, USA.
Publication information
Sci Rep. 2020 Mar 3;10(1):3958. doi: 10.1038/s41598-020-61055-6.
The development of deep learning algorithms for complex tasks in digital medicine has relied on the availability of large labeled training datasets, usually containing hundreds of thousands of examples. The purpose of this study was to develop a 3D deep learning model, AppendiXNet, to detect appendicitis, one of the most common life-threatening abdominal emergencies, using a small training dataset of fewer than 500 CT exams. We explored whether pretraining the model on a large collection of natural videos would improve its performance over training from scratch. AppendiXNet was pretrained on Kinetics, a large collection of approximately 500,000 YouTube video clips, each annotated with one of 600 human action classes, and then fine-tuned on a small dataset of 438 CT scans annotated for appendicitis. We found that pretraining the 3D model on natural videos significantly improved performance, raising the AUC from 0.724 (95% CI 0.625, 0.823) to 0.810 (95% CI 0.725, 0.895). The application of deep learning to detect abnormalities on CT examinations using video pretraining could generalize effectively to other challenging cross-sectional medical imaging tasks when training data are limited.
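For readers unfamiliar with how an AUC and its 95% confidence interval are obtained, the following is a minimal sketch in Python. It is illustrative only: the abstract does not state which interval method the authors used, so the Hanley-McNeil normal approximation here is an assumption, and the function names, labels, and scores are hypothetical toy values rather than study data.

```python
import math


def auc_mann_whitney(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate confidence interval for an AUC using the
    Hanley-McNeil standard error (an assumed method, not necessarily
    the one used in the study). Bounds are clipped to [0, 1]."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical toy data: 1 = appendicitis, 0 = no appendicitis.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]

toy_auc = auc_mann_whitney(toy_labels, toy_scores)
toy_low, toy_high = hanley_mcneil_ci(toy_auc, n_pos=4, n_neg=4)
print(f"AUC = {toy_auc:.3f} (95% CI {toy_low:.3f}, {toy_high:.3f})")
```

With so few cases the interval is very wide, which is the same effect that makes the reported intervals for the two models overlap despite the difference in point estimates.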