IEEE Trans Med Imaging. 2018 Dec;37(12):2572-2581. doi: 10.1109/TMI.2018.2842767. Epub 2018 Jun 1.
To realize the full potential of deep learning for medical imaging, large annotated datasets are required for training. Such datasets are difficult to acquire because of privacy issues, a lack of experts available for annotation, underrepresentation of rare conditions, and poor standardization. In conventional vision applications, the lack of annotated data has been addressed using synthetic images refined via unsupervised adversarial training to look like real images. However, this approach is difficult to extend to general medical imaging because of the complex and diverse set of features found in real human tissues. We propose a novel framework that reverses this flow: adversarial training is used to make real medical images more like synthetic images, while clinically relevant features are preserved via self-regularization. These domain-adapted, synthetic-like images can then be accurately interpreted by networks trained on large datasets of synthetic medical images. We implement this approach on the notoriously difficult task of depth estimation from monocular endoscopy, which has a variety of applications in colonoscopy, robotic surgery, and invasive endoscopic procedures. We train a depth estimator on a large dataset of synthetic images generated using an accurate forward model of an endoscope and an anatomically realistic colon. Our analysis demonstrates that, for a network trained solely on synthetic data, reverse domain adaptation improved the structural similarity of depth estimates in a real pig colon by 78.7%.
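As an illustration of how an adversarial objective and a self-regularization term might be combined, the following is a minimal sketch only. The abstract does not give the loss; the functional form, the function name `transformer_loss`, the weight `lam`, and the use of a mean absolute pixel difference are all assumptions made for this example, not the authors' stated implementation.

```python
import numpy as np

def transformer_loss(real, transformed, disc_prob_synthetic, lam=0.5):
    """Hypothetical loss for a network that maps real images to synthetic-like ones.

    real, transformed   : arrays of the same shape (input image and its transformed version)
    disc_prob_synthetic : discriminator's probability that each transformed image is synthetic
    lam                 : assumed weight on the self-regularization term
    """
    eps = 1e-12  # avoids log(0)
    # Adversarial term: small when the discriminator believes the output is synthetic.
    adversarial = -np.mean(np.log(np.asarray(disc_prob_synthetic) + eps))
    # Self-regularization term: penalizes pixel-wise departure from the input image,
    # which discourages the transformation from discarding image content.
    self_reg = np.mean(np.abs(np.asarray(transformed) - np.asarray(real)))
    return adversarial + lam * self_reg

# Toy usage with random data standing in for images.
rng = np.random.default_rng(0)
real = rng.random((4, 8, 8))
transformed = real + 0.2  # a uniform shift of 0.2 per pixel
loss = transformer_loss(real, transformed, np.ones(4))
```

In this toy case the discriminator is fully convinced, so the adversarial term is essentially zero and the loss reduces to `lam` times the mean pixel shift. Raising `lam` in such a formulation would trade realism in the synthetic domain for closer fidelity to the input image.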