Mao Lingchao, Wang Lujia, Hu Leland S, Eschbacher Jenny M, De Leon Gustavo, Singleton Kyle W, Curtin Lee A, Urcuyo Javier, Sereduk Chris, Tran Nhan L, Hawkins-Daarud Andrea, Swanson Kristin R, Li Jing
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA.
Department of Radiology at Mayo Clinic Arizona, Phoenix, AZ 85054 USA.
IEEE Trans Autom Sci Eng. 2024 Oct;21(4):6250-6264. doi: 10.1109/tase.2023.3323773. Epub 2023 Oct 23.
Precision medicine aims to provide diagnosis and treatment that account for individual differences. Machine learning models that support precision medicine are expected to perform better when personalized than when built as one-model-fits-all. A significant challenge, however, is that only a limited number of labeled samples can be collected from each individual due to practical constraints. Transfer Learning (TL) addresses this challenge by leveraging information from other patients with the same disease (i.e., the source domain) when building a personalized model for each patient (i.e., the target domain). We propose Weakly-Supervised Transfer Learning (WS-TL) to tackle two challenges that existing TL algorithms do not address well: (i) the target domain has only a few or even no labeled samples; (ii) domain knowledge needs to be integrated into the TL design. We design a novel mathematical framework for WS-TL that learns a model for the target domain from paired samples whose order relationships are inferred from domain knowledge, while simultaneously integrating labeled samples from the source domain for transfer learning. We also propose an efficient active sampling strategy to select informative paired samples. Theoretical properties are investigated as well. Finally, we present a real-world application in precision medicine for brain cancer, where WS-TL is used to build personalized patient models that predict the Tumor Cell Density (TCD) distribution across the brain from MRI. WS-TL achieves the highest accuracy in a comparison with a variety of existing TL algorithms. The predicted TCD map for each patient can help facilitate individually optimized treatment.
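The abstract does not state the WS-TL objective, so the following is not the authors' formulation. It is a minimal sketch of the general idea of combining labeled source samples with ordered target pairs, under loudly stated assumptions: a plain linear model, a squared-error loss on the source domain, and a hinge loss that penalizes target pairs predicted in the wrong order. The names `fit_pairwise_transfer`, `pair_accuracy`, `lam`, and `margin` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_pairwise_transfer(X_src, y_src, X_hi, X_lo,
                          lam=1.0, margin=0.1, lr=0.01, n_iter=2000):
    """Fit a linear model w by subgradient descent (illustrative only).

    Assumed objective (NOT taken from the paper):
        mean squared error on labeled source samples
        + lam * mean hinge loss on target pairs,
    where row i of X_hi is assumed, from domain knowledge, to have a
    higher response than row i of X_lo.
    """
    w = np.zeros(X_src.shape[1])
    D = X_hi - X_lo  # pairwise feature differences; we want D @ w >= margin
    for _ in range(n_iter):
        # Gradient of the source-domain squared-error term.
        resid = X_src @ w - y_src
        grad_src = 2.0 * X_src.T @ resid / len(y_src)
        # Subgradient of the hinge term: only pairs inside the margin contribute.
        violated = (D @ w) < margin
        grad_pair = -D[violated].sum(axis=0) / len(D)
        w -= lr * (grad_src + lam * grad_pair)
    return w


def pair_accuracy(w, X_hi, X_lo):
    """Fraction of target pairs that the model orders correctly."""
    return float(np.mean((X_hi - X_lo) @ w > 0))
```

Setting `lam=0` reduces the sketch to a source-only least-squares fit, which gives a simple baseline for checking whether the ordered pairs move the model toward the target domain. The active sampling strategy mentioned in the abstract is not modeled here.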