Chen Jing, Bardes Eric E, Aronow Bruce J, Jegga Anil G
Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W305-11. doi: 10.1093/nar/gkp427. Epub 2009 May 22.
ToppGene Suite (http://toppgene.cchmc.org; free and open to all users, with no login required) is a one-stop portal for (i) gene list functional enrichment, (ii) candidate gene prioritization using either functional annotations or network analysis and (iii) identification and prioritization of novel disease candidate genes in the interactome. Functional annotation-based disease candidate gene prioritization uses a fuzzy-based similarity measure to compute the similarity between any two genes based on semantic annotations. The similarity scores from individual features are combined into an overall score using statistical meta-analysis. A P-value for each annotation of a test gene is derived by random sampling of the whole genome. The protein-protein interaction network (PPIN)-based disease candidate gene prioritization uses social and Web network analysis algorithms (extended versions of the PageRank and HITS algorithms, and the K-Step Markov method). We demonstrate the utility of ToppGene Suite using 20 recently reported GWAS-based gene-disease associations (including novel disease genes) representing five diseases. ToppGene ranked 19 of 20 (95%) candidate genes within the top 20%, while ToppNet ranked 12 of 16 (75%) candidate genes within the top 20%.
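The abstract names extended versions of PageRank among the algorithms used for PPIN-based prioritization but does not specify the extension. As a rough illustration only, the sketch below assumes one common variant, PageRank with a restart distribution concentrated on known disease ("seed") genes, and ranks candidates by their resulting score. The function names, the damping value of 0.85, and the toy network with its placeholder gene labels are all hypothetical; none of this is taken from the ToppNet implementation.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=200, tol=1e-12):
    """Score nodes of an undirected network by proximity to the seed nodes.

    Assumption: restarts go uniformly to the seeds (PageRank with priors).
    This is an illustrative stand-in, not the published ToppNet algorithm.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    seeds_in_graph = [s for s in set(seeds) if s in neighbors]
    if not seeds_in_graph:
        raise ValueError("no seed gene is present in the network")

    # Restart (prior) distribution: uniform over seeds, zero elsewhere.
    prior = {n: 0.0 for n in nodes}
    for s in seeds_in_graph:
        prior[s] = 1.0 / len(seeds_in_graph)

    score = dict(prior)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor passes on an equal share of its current score.
            incoming = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = (1.0 - damping) * prior[n] + damping * incoming
        change = sum(abs(updated[n] - score[n]) for n in nodes)
        score = updated
        if change < tol:
            break
    return score


def rank_candidates(edges, seeds, candidates):
    """Order candidate genes from highest to lowest network score."""
    score = personalized_pagerank(edges, seeds)
    return sorted(candidates, key=lambda g: score.get(g, 0.0), reverse=True)


# Toy network with placeholder labels (not real interaction data).
toy_edges = [
    ("seed_1", "seed_2"),
    ("seed_1", "cand_A"),
    ("seed_2", "cand_A"),
    ("cand_A", "cand_B"),
    ("cand_B", "cand_C"),
]
toy_seeds = ["seed_1", "seed_2"]
toy_scores = personalized_pagerank(toy_edges, toy_seeds)
toy_ranking = rank_candidates(toy_edges, toy_seeds, ["cand_C", "cand_B", "cand_A"])
print(toy_ranking)
```

In this toy network the candidate that interacts directly with both seeds scores higher than candidates that are two or three interaction steps away, which is the intuition behind ranking test genes by network proximity to a training set.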