Bass Andrew J, Cutler David J, Epstein Michael P
Department of Medicine, University of Cambridge, Cambridge, CB2 0QQ, UK.
Department of Human Genetics, Emory University, Atlanta, GA 30322, USA.
bioRxiv. 2024 Dec 3:2024.11.29.626006. doi: 10.1101/2024.11.29.626006.
Differential co-expression analysis (DCA) aims to identify genes in a pathway whose shared expression depends on a risk factor. While DCA provides insights into the biological activity of diseases, existing methods are limited to categorical risk factors and/or suffer from bias due to batch and variance-specific effects. We propose a new framework, Kernel-based Differential Co-expression Analysis (KDCA), that harnesses correlation patterns between genes in a pathway to detect differential co-expression arising from general (i.e., continuous, discrete, or categorical) risk factors. Using various simulated pathway architectures, we find that KDCA accounts for common sources of bias to control the type I error rate while substantially increasing the power compared to the standard eigengene approach. We then applied KDCA to The Cancer Genome Atlas thyroid data set and found several differentially co-expressed pathways by age of diagnosis and mutation status that were undetected by the eigengene method. Collectively, our results demonstrate that KDCA is a powerful testing framework that expands DCA applications in expression studies.
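To make the idea of differential co-expression by a continuous risk factor concrete, the following is a minimal sketch and not the KDCA method itself. It simulates two genes whose correlation varies linearly with a risk factor, then regresses the product of their standardized expression values on that risk factor, a simple cross-product regression used here only for illustration. The function names, the linear form of the correlation, and the normal approximation for the p-value are all assumptions made for this example.

```python
import math
import numpy as np


def simulate_pair(n, slope, rng):
    """Simulate two genes whose correlation depends on a continuous risk factor x.

    Hypothetical setup: the correlation is rho(x) = slope * x, clipped to (-0.95, 0.95).
    """
    x = rng.uniform(-1.0, 1.0, size=n)
    rho = np.clip(slope * x, -0.95, 0.95)
    g1 = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Conditional on x, (g1, g2) are standard normal with correlation rho(x).
    g2 = rho * g1 + np.sqrt(1.0 - rho**2) * noise
    return x, g1, g2


def cross_product_test(x, g1, g2):
    """Regress the product of standardized expressions on x and test the slope.

    A nonzero slope suggests that the co-expression of the pair changes with x.
    Returns (slope estimate, test statistic, two-sided p-value under a normal
    approximation).
    """
    z1 = (g1 - g1.mean()) / g1.std()
    z2 = (g2 - g2.mean()) / g2.std()
    y = z1 * z2  # per-sample contribution to the correlation
    xc = x - x.mean()
    beta = (xc @ y) / (xc @ xc)  # OLS slope with an intercept
    resid = (y - y.mean()) - beta * xc
    se = math.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))
    t = beta / se
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return beta, t, p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, g1, g2 = simulate_pair(2000, slope=0.8, rng=rng)
    beta, t, p = cross_product_test(x, g1, g2)
    print(f"slope={beta:.3f}, statistic={t:.2f}, p={p:.2e}")
```

This pairwise test ignores batch effects and risk-factor-dependent variances, which the abstract identifies as sources of bias, and it considers one gene pair rather than a whole pathway.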