Chair for Computer Aided Medical Procedures, Fakultät für Informatik, Technische Universität München, Germany.
IBM Almaden Research Center, San Jose, CA, USA.
Med Image Anal. 2016 Aug;32:1-17. doi: 10.1016/j.media.2016.02.005. Epub 2016 Mar 17.
In this paper, we propose a supervised domain adaptation (DA) framework for adapting decision forests in the presence of distribution shift between training (source) and testing (target) domains, given only a few labeled examples. We introduce a novel method for DA through an error-correcting hierarchical transfer relaxation scheme with domain alignment, feature normalization, and leaf posterior reweighting to correct for the distribution shift between the domains. For the first time, we apply DA to the challenging problem of extending in vitro trained forests (source domain) to in vivo applications (target domain). A proof of concept is provided for in vivo characterization of atherosclerotic tissues using intravascular ultrasound signals, where the presence of flowing blood is a source of distribution shift between the two domains. This shift can lead to misclassification when an in vitro trained classifier is deployed directly, which motivates the need for DA, since obtaining reliable in vivo training labels is often challenging, if not infeasible. Exhaustive validation and parameter sensitivity analysis substantiate the reliability of the proposed DA framework and demonstrate improved tissue characterization performance in scenarios where adaptation is conducted in the presence of only a few examples. The proposed method can thus be leveraged to reduce annotation costs and improve computational efficiency over conventional retraining approaches.
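The abstract names leaf posterior reweighting as one component of the framework but does not give its formula. As a rough illustration of the general idea only, the sketch below blends the class posterior stored in a single leaf of a source-trained forest with the counts of the few labeled target samples that reach that leaf. The function name `reweight_leaf_posterior`, the `prior_strength` parameter, and the pseudo-count blending rule are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def reweight_leaf_posterior(
    source_counts: Sequence[float],
    target_counts: Sequence[float],
    prior_strength: float = 1.0,
) -> List[float]:
    """Blend a leaf's source class posterior with a few target label counts.

    Hypothetical rule (not the paper's): the source posterior is treated as a
    prior worth `prior_strength` pseudo-samples, and each labeled target
    sample that reaches the leaf counts as one observation.
    """
    if len(source_counts) != len(target_counts):
        raise ValueError("source and target counts must cover the same classes")
    total_source = sum(source_counts)
    if total_source <= 0:
        raise ValueError("the leaf must contain at least one source sample")
    if prior_strength <= 0:
        raise ValueError("prior_strength must be positive")

    # Class posterior estimated in the source domain (e.g. in vitro data).
    source_posterior = [c / total_source for c in source_counts]

    # With no target samples the source posterior is returned unchanged;
    # as target samples accumulate, they dominate the estimate.
    denom = prior_strength + sum(target_counts)
    return [
        (prior_strength * p + t) / denom
        for p, t in zip(source_posterior, target_counts)
    ]


# A leaf that saw 8 source samples of class 0 and 2 of class 1,
# then receives 4 labeled target samples, all of class 1.
adapted = reweight_leaf_posterior([8, 2], [0, 4], prior_strength=1.0)
```

In this toy rule, a larger `prior_strength` keeps the adapted leaf closer to its source posterior, which may be preferable when target labels are very scarce or noisy.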