Xie Yaochen, Xie Ziqian, Islam Sheikh Muhammad Saiful, Zhi Degui, Ji Shuiwang
ArXiv. 2023 Sep 26:arXiv:2309.15132v1.
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits. When applied to high-dimensional medical imaging data, a key step is to extract lower-dimensional yet informative representations of the data to serve as traits. Representation learning for imaging genetics remains largely under-explored because GWAS poses unique challenges compared with typical visual representation learning. In this study, we tackle this problem from the mutual information (MI) perspective by identifying key limitations of existing methods. We introduce a trans-modal learning framework, Genetic InfoMax (GIM), which includes a regularized MI estimator and a novel genetics-informed transformer to address the specific challenges of GWAS. We evaluate GIM on human brain 3D MRI data and establish standardized evaluation protocols to compare it with existing approaches. Our results demonstrate the effectiveness of GIM and significantly improved performance on GWAS.