IEEE J Biomed Health Inform. 2023 Jun;27(6):3083-3092. doi: 10.1109/JBHI.2023.3264029. Epub 2023 Jun 6.
One of the major goals in gene expression data analysis is to explore and discover groups of genes and groups of biological conditions with meaningful relationships. While this problem can be addressed by algorithms, their results must be analyzed in context, since they may be affected by many side processes (such as tissue differentiation) that can obscure the relationships of interest. Visual analytics-based methods for exploratory analysis of the gene expression matrix (GEM) are essential in biomedical research because they allow the analysis to be framed within the user's domain knowledge. In this paper, we present a visual analytics approach to discover relevant connections between genes and samples. It links a reordered GEM heatmap with dual 2D projections of its rows and columns, which can be recomputed conditioned on subsets of genes and/or samples selected by the user during the analysis. We demonstrate the capability of our approach to discover relevant knowledge in three case studies involving two cancer types plus normal tissue from the TCGA database.
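The idea of dual projections recomputed on a user selection can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the projection or reordering method, so PCA (via SVD) and sorting by the first projection axis are stand-ins, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0, keepdims=True)       # center each feature
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :2] * S[:2]

def dual_projections(gem, gene_idx=None, sample_idx=None):
    """Return 2D projections of genes (rows) and samples (columns).

    Genes are described only by the selected samples, and samples only by
    the selected genes, so each map is conditioned on the user's selection.
    Each selection needs at least two elements.
    """
    n_genes, n_samples = gem.shape
    gene_idx = np.arange(n_genes) if gene_idx is None else np.asarray(gene_idx)
    sample_idx = np.arange(n_samples) if sample_idx is None else np.asarray(sample_idx)
    gene_proj = pca_2d(gem[:, sample_idx])       # shape (n_genes, 2)
    sample_proj = pca_2d(gem[gene_idx, :].T)     # shape (n_samples, 2)
    return gene_proj, sample_proj

def reorder_by_first_axis(gem, gene_proj, sample_proj):
    """Reorder heatmap rows and columns by the first projection axis (assumed rule)."""
    row_order = np.argsort(gene_proj[:, 0])
    col_order = np.argsort(sample_proj[:, 0])
    return gem[np.ix_(row_order, col_order)], row_order, col_order

# Synthetic GEM: 50 genes x 12 samples, with 20 genes raised in the first 6 samples.
rng = np.random.default_rng(0)
gem = rng.normal(size=(50, 12))
gem[:20, :6] += 3.0

# Recompute the sample map conditioned on a selected gene subset.
gene_proj, sample_proj = dual_projections(gem, gene_idx=range(20))
heatmap, row_order, col_order = reorder_by_first_axis(gem, gene_proj, sample_proj)
```

Selecting only the 20 shifted genes makes the two sample groups separate along the first axis of the sample map, which is the kind of effect conditioning on a subset is meant to expose.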