IEEE Trans Med Imaging. 2024 Nov;43(11):4004-4016. doi: 10.1109/TMI.2024.3414476. Epub 2024 Nov 4.
Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines a broad range of downstream tasks without requiring numerous costly annotations. However, brain network foundation models have received limited investigation, which restricts their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, 1) we curated a comprehensive dataset by collating images from 30 datasets, comprising 70,781 samples from 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal; 2) we propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module regularizes augmented brain networks of the same participant, which have similar topological properties, to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. Moreover, BrainMass demonstrates powerful few/zero-shot learning abilities and provides meaningful interpretations for various diseases, showcasing its potential for clinical applications.
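The pFC augmentation described in the abstract can be illustrated with a minimal sketch. The abstract states only that certain timepoints of the BOLD signal are randomly dropped; everything else here is an assumption made for illustration: the function name `pseudo_fc`, the `keep_ratio` parameter and its default, the `(timepoints, ROIs)` array layout, and the use of Pearson correlation as the connectivity measure. It should not be read as the authors' implementation.

```python
import numpy as np


def pseudo_fc(bold, keep_ratio=0.8, rng=None):
    """Build one augmented connectivity matrix from a BOLD time series.

    bold       : array of shape (n_timepoints, n_rois); layout is an assumption.
    keep_ratio : fraction of timepoints retained (hypothetical parameter;
                 the abstract does not say how many timepoints are dropped).
    rng        : optional numpy Generator, for reproducibility.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_timepoints = bold.shape[0]
    # Keep at least two timepoints so that a correlation is defined.
    n_keep = max(2, int(round(keep_ratio * n_timepoints)))
    # Randomly choose which timepoints survive, preserving temporal order.
    keep = np.sort(rng.choice(n_timepoints, size=n_keep, replace=False))
    # Pearson correlation between ROI time series (columns are variables).
    # Pearson correlation is an assumed choice of connectivity measure.
    return np.corrcoef(bold[keep], rowvar=False)


# Usage: two augmented networks derived from the same synthetic scan.
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 10))  # 200 timepoints, 10 ROIs (synthetic)
fc_a = pseudo_fc(bold, keep_ratio=0.8, rng=rng)
fc_b = pseudo_fc(bold, keep_ratio=0.8, rng=rng)
```

Because each call draws a different random subset of timepoints, a single scan can yield many distinct but closely related networks. Pairs such as `fc_a` and `fc_b` are the kind of same-participant augmentations that the LRA module is described as aligning in latent space.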