Wang Yishu, Yang Dejie, Deng Minghua
Center for Quantitative Biology, Peking University, Beijing 100871, China.
Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China.
Biomed Res Int. 2015;2015:573956. doi: 10.1155/2015/573956. Epub 2015 Jul 26.
Epistatic miniarray profile (EMAP) studies have enabled the mapping of large-scale genetic interaction networks and have generated large amounts of data in model organisms. One approach to analyzing EMAP data is to identify gene modules made up of densely interacting genes. The genetic interaction score (S score) is also informative, because it reflects the degree to which two mutants have a synergizing or mitigating effect on each other. Statistical approaches that exploit both modularity and pairwise interactions may provide more insight into the underlying biology. However, the high missing rate in EMAP data hinders the development of such approaches. To address this problem, we adopted the matrix decomposition methodology "low-rank and sparse decomposition" (LRSDec) to decompose the EMAP data matrix into a low-rank part and a sparse part.
LRSDec proved to be an effective technique for analyzing EMAP data. We applied it to a synthetic dataset and to an EMAP dataset studying RNA-related processes in Saccharomyces cerevisiae. From the results, we constructed global views of the genetic cross talk between different RNA-related protein complexes and processes and predicted novel functions of genes.
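To make the idea of splitting a data matrix with missing entries into a low-rank part and a sparse part concrete, the following is a minimal sketch. It is not the authors' LRSDec algorithm. It is a generic alternating scheme (truncated SVD for the low-rank part, hard thresholding for the sparse part, missing entries filled with the current low-rank estimate). The function name `low_rank_sparse_decompose` and the parameters `mask`, `rank`, `card`, and `n_iter` are assumptions introduced for illustration only.

```python
import numpy as np


def low_rank_sparse_decompose(X, mask, rank, card, n_iter=100):
    """Approximate X by L + S on the observed entries (mask == True).

    L has rank at most `rank`; S has at most `card` nonzero entries.
    Illustrative sketch only, not the published LRSDec procedure.
    """
    mask = np.asarray(mask, dtype=bool)
    # Zero out missing entries so that NaNs never enter the arithmetic.
    X = np.where(mask, np.asarray(X, dtype=float), 0.0)
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # Observed entries: data minus sparse part.
        # Missing entries: current low-rank estimate.
        filled = np.where(mask, X - S, L)
        # Low-rank step: keep the leading `rank` singular triplets.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep the `card` largest-magnitude residuals
        # (residuals at missing entries are zero by construction).
        R = np.where(mask, X - L, 0.0)
        S = np.zeros_like(X)
        if card > 0:
            idx = np.argpartition(np.abs(R), -card, axis=None)[-card:]
            S.flat[idx] = R.flat[idx]
    return L, S
```

In this sketch, `rank` bounds the complexity of the low-rank part and `card` bounds the number of entries assigned to the sparse part. Both would have to be chosen for the dataset at hand, and the source does not state how the authors select them.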