Wang Li, Tu Zhidong, Sun Fengzhu
Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA.
BMC Genomics. 2009 May 12;10:220. doi: 10.1186/1471-2164-10-220.
The recently developed RNA interference (RNAi) technology has created an unprecedented opportunity to interrogate the function of individual genes in whole organisms or cell lines at genome-wide scale. However, multiple issues, such as off-target effects and low knockdown efficacy for certain genes, make RNAi screening results noisy, potentially yielding high rates of both false positives and false negatives. Integrating RNAi screening results with other information, such as protein-protein interaction (PPI) data, may therefore help to address these issues.
By analyzing 24 genome-wide RNAi screens interrogating various biological processes in Drosophila, we found that, for nearly all screens, RNAi positive hits were significantly more connected to each other within a protein-protein interaction network than expected at random. Based on this finding, we developed a network-based approach to identify false positives (FPs) and false negatives (FNs) in these screening results. This approach relies on a scoring function, which we termed NePhe, that integrates information from both the PPI network and the RNAi screening results. Using a novel rank-based test, we compared the performance of different NePhe scoring functions and found that diffusion kernel-based methods generally outperformed others, such as direct neighbor-based methods. Using two genome-wide RNAi screens as examples, we validated our approach extensively from multiple aspects. We prioritized hits in the original screens that were more likely to be reproduced by the validation screen, and we recovered potential FNs whose involvement in the biological process was suggested by previous knowledge and mutant phenotypes. Finally, we demonstrated that the NePhe scoring system helped to interpret RNAi results biologically at the module level.
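The contrast between direct neighbor-based and diffusion kernel-based scoring can be illustrated with a minimal sketch. The abstract does not give the NePhe formulas, so everything below is an assumption made for illustration: the toy network, the binary hit vector, the kernel K = exp(-beta * L) with L the graph Laplacian, the value of beta, the choice to exclude a gene's own phenotype from its score, and all function names. Plain Python lists are used only to keep the example self-contained; a real analysis would more likely use a numerical library.

```python
# Illustrative sketch only: not the authors' implementation of NePhe.

def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(adj, beta=0.5, terms=40):
    """Approximate K = exp(-beta * L) with a truncated Taylor series."""
    n = len(adj)
    m = [[-beta * x for x in row] for row in laplacian(adj)]
    kernel = [[float(i == j) for j in range(n)] for i in range(n)]  # k = 0 term
    term = [row[:] for row in kernel]
    for k in range(1, terms):
        # term becomes (-beta * L)^k / k!
        term = [[x / k for x in row] for row in matmul(term, m)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return kernel

def direct_neighbor_score(adj, hits):
    """Number of direct PPI neighbors of each gene that are screen hits."""
    n = len(adj)
    return [sum(adj[i][j] * hits[j] for j in range(n)) for i in range(n)]

def diffusion_score(kernel, hits):
    """Kernel-weighted sum of the phenotypes of all other genes."""
    n = len(kernel)
    return [sum(kernel[i][j] * hits[j] for j in range(n) if j != i)
            for i in range(n)]

# Hypothetical toy network: genes 0, 1, 2 form a triangle, followed by the
# chain 2-3-4-5.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
n_genes = 6
adj = [[0] * n_genes for _ in range(n_genes)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

# Hypothetical screen result: genes 0, 1 and 5 scored as hits.
hits = [1, 1, 0, 0, 0, 1]

direct = direct_neighbor_score(adj, hits)
kernel = diffusion_kernel(adj)
diffusion = diffusion_score(kernel, hits)

for gene in range(n_genes):
    print(gene, hits[gene], direct[gene], round(diffusion[gene], 4))
```

In this toy case, gene 2 is not a hit but sits next to two hits, so both scores rank it above gene 3 and it would be a candidate FN. Gene 5 is a hit far from the other hits, so its diffusion score is lower than that of gene 0 and it would be a candidate FP. Unlike the neighbor count, the diffusion score also gives weight to hits that are several interactions away.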
By comprehensively analyzing multiple genome-wide RNAi screens, we conclude that network information can be effectively integrated with RNAi results to flag likely FPs and FNs and to bring biological insight to the screening results.