Department of Epidemiology, School of Public Health, University of Michigan, 1415 Washington Heights No. 4605, Ann Arbor, MI 48109, USA.
Hum Mol Genet. 2010 Nov 15;19(22):4345-52. doi: 10.1093/hmg/ddq356. Epub 2010 Aug 24.
Epistasis (i.e. gene-gene interaction) has long been recognized as an important mechanism underlying the complexity of the genetic architecture of human traits. Definitions of epistasis range from the purely molecular to the traditional statistical measures of interaction. Statistically detected epistasis usually does not map onto, or relate easily to, the biological interactions between genetic variants, whether these arise through a combined influence on gene expression or through interactions at the gene product (i.e. protein) or DNA level. Recently, more high-dimensional data on protein-protein interactions (PPIs) and gene expression profiles have been collected that enumerate sets of biological interactions. To better align statistical and molecular models of epistasis, we present an example of how to incorporate PPI information into the statistical analysis of interactions between copy number variations (CNVs). Among the 23 640 pairs of known human PPIs and the 1141 common CNVs detected among HapMap samples, we identified 37 pairs of CNVs overlapping with both genes of a PPI pair. Two CNV pairs provided sufficient genotype variation to search for epistatic effects on gene expression. Using 47 294 probe-specific gene expression levels as the outcomes, we identified five epistatic effects with P-values less than 10^-6. We found a CNV-CNV interaction significantly associated with gene expression of TP53TG3 (P-value of 2 × 10^-20). The proteins associated with the CNV pair also bind TP53, which regulates the transcription of TP53TG3. This study demonstrates that using PPI data can help target statistical hypothesis testing to biologically plausible epistatic interactions that reflect molecular mechanisms.
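The filtering step described above, keeping only CNV pairs in which each CNV overlaps one gene of a known PPI pair, can be sketched as a simple interval-overlap search. This is a minimal illustration and not the authors' pipeline: the `Region` type, the function names, the toy coordinates, and the choice to skip a single CNV that covers both genes are all assumptions made for the example.

```python
from itertools import product
from typing import NamedTuple


class Region(NamedTuple):
    """A named genomic interval (hypothetical helper type, inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    """True when two intervals lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def cnv_pairs_for_ppis(ppis, genes, cnvs):
    """Return sorted (cnv_a, cnv_b, gene_a, gene_b) tuples for every PPI pair
    whose two genes are each overlapped by a CNV.

    ppis  : iterable of (gene_a, gene_b) name pairs
    genes : dict mapping gene name -> Region
    cnvs  : list of Region
    """
    hits = set()
    for gene_a, gene_b in ppis:
        if gene_a not in genes or gene_b not in genes:
            continue  # no coordinates available for one of the partners
        cnvs_a = [c for c in cnvs if overlaps(c, genes[gene_a])]
        cnvs_b = [c for c in cnvs if overlaps(c, genes[gene_b])]
        for ca, cb in product(cnvs_a, cnvs_b):
            # Assumption: one CNV spanning both genes is not a CNV-CNV pair.
            if ca.name != cb.name:
                hits.add((ca.name, cb.name, gene_a, gene_b))
    return sorted(hits)


# Toy data (invented coordinates, for illustration only).
genes = {
    "G1": Region("G1", "chr1", 100, 200),
    "G2": Region("G2", "chr2", 500, 600),
    "G3": Region("G3", "chr1", 1000, 1100),
}
cnvs = [
    Region("CNV_A", "chr1", 150, 250),
    Region("CNV_B", "chr2", 550, 700),
    Region("CNV_C", "chr3", 1, 50),
]
ppis = [("G1", "G2"), ("G1", "G3")]

candidate_pairs = cnv_pairs_for_ppis(ppis, genes, cnvs)
print(candidate_pairs)  # [('CNV_A', 'CNV_B', 'G1', 'G2')]
```

Each surviving pair would then be carried forward to an interaction test against the expression outcomes, provided the pair shows enough genotype variation in the samples. Restricting the search this way is what reduces the number of hypotheses from all possible CNV pairs to the handful with a plausible molecular basis.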