IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):7983-7997. doi: 10.1109/TPAMI.2024.3399098. Epub 2024 Nov 6.
In healthcare, sample acquisition is usually restricted by multiple considerations, including cost, labor-intensive annotation, privacy concerns, and radiation hazards. Synthesizing images of interest is therefore an important tool for data augmentation. Diffusion models have recently attained state-of-the-art results in various synthesis tasks, and embedding energy functions has been shown to effectively guide a pre-trained model to synthesize target samples. However, we notice that current method development and validation are still limited to improving indicators such as the Fréchet Inception Distance (FID) and the Inception Score (IS), and have not provided deeper investigations of downstream tasks such as disease grading and diagnosis. Moreover, existing classifier guidance, which can be regarded as a special case of an energy function, has only a singular effect on altering the distribution of the synthetic dataset. This may lead to in-distribution synthetic samples that offer limited help to downstream model optimization. All these limitations remind us that we still have a long way to go to achieve controllable generation. In this work, we first analyze previous guidance, as well as its contributions to further applications, from the perspective of data distribution. To synthesize samples that can help downstream applications, we then introduce uncertainty guidance in each sampling step and design an uncertainty-guided diffusion model. Extensive experiments on four medical datasets, with ten classic networks trained on the augmented sample sets, provide a comprehensive evaluation of the practical contributions of our methodology. Furthermore, we provide a theoretical guarantee for general gradient guidance in diffusion models, which would benefit future research investigating other forms of measurement guidance for specific generative tasks.
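The abstract does not say how uncertainty is measured or how it enters the sampling step, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes uncertainty is the predictive entropy of a toy binary logistic classifier, and it borrows the mean-shift form commonly used for classifier guidance (shift the denoising mean by a scaled gradient). All function names, the classifier weights `w` and `b`, the variance `sigma2`, and the guidance `scale` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predictive_entropy(x, w, b):
    """Binary entropy of a toy logistic classifier p(y=1|x) = sigmoid(x @ w + b)."""
    p = sigmoid(x @ w + b)
    eps = 1e-12  # avoids log(0)
    return -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))


def entropy_grad(x, w, b):
    """Analytic gradient of the predictive entropy with respect to x.

    For a logistic model, dH/dz = -z * p * (1 - p) with z = x @ w + b,
    and the chain rule gives dH/dx = dH/dz * w.
    """
    z = x @ w + b
    p = sigmoid(z)
    return (-z * p * (1.0 - p))[:, None] * w[None, :]


def guided_mean(mean, sigma2, grad, scale):
    """Shift a denoising-step mean along a guidance gradient.

    ASSUMPTION: same form as classifier guidance, mean + scale * sigma2 * grad,
    with the class log-likelihood gradient replaced by an uncertainty gradient.
    """
    return mean + scale * sigma2 * grad


# Toy usage: a point the classifier is confident about is nudged
# toward the decision boundary, where predictive entropy is higher.
w, b = np.array([1.0, 0.0]), 0.0
x = np.array([[2.0, 0.0]])
x_guided = guided_mean(x, sigma2=0.5, grad=entropy_grad(x, w, b), scale=1.0)
```

In a real sampler the mean and variance would come from the pre-trained diffusion model at each step, and the gradient would be taken through a trained network by automatic differentiation. The closed-form gradient here only keeps the example self-contained.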