Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China.
OMICS. 2010 Aug;14(4):337-56. doi: 10.1089/omi.2009.0143.
Identifying disease genes is important not only for better understanding gene function and biological processes but also for advancing human medicine. Many computational methods have been proposed that score candidate genes by their similarity to all known disease genes (seed genes) across the entire gene interaction network. Under the hypotheses that potential disease-related genes should lie near the seed genes in the network, and that only seed genes located in the same module as a candidate gene contribute to its prediction, three modularized candidate disease gene prioritization algorithms (MCDGPAs) are proposed to identify disease-related genes. MCDGPA proceeds in three steps: module partition, gene prioritization within each disease-associated module, and rank fusion to produce the global ranking. When applied to prostate cancer and breast cancer networks, MCDGPA significantly improves on previous algorithms in terms of cross-validation and disease-related gene prediction. In addition, the improvement is robust to the choice of gene prioritization method used within each disease-associated module and to the choice of module partition algorithm used to partition the network. In this sense, MCDGPA is a general framework that can integrate many previous gene prioritization methods and improve their predictive accuracy.
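The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the partition algorithm, the within-module scoring, or the fusion rule, so all three are placeholder assumptions here. Connected components stand in for module partition, the fraction of in-module neighbors that are seeds stands in for the prioritization method, and relative within-module rank stands in for rank fusion. The names `connected_modules`, `neighbor_score`, and `mcdgpa_sketch`, and the toy network, are hypothetical.

```python
from collections import deque


def connected_modules(adj):
    """Placeholder partition (assumption): each connected component is a module."""
    seen, modules = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        modules.append(component)
    return modules


def neighbor_score(adj, module, module_seeds, gene):
    """Placeholder score (assumption): share of in-module neighbors that are seeds."""
    neighbors = [n for n in adj[gene] if n in module]
    if not neighbors:
        return 0.0
    return sum(n in module_seeds for n in neighbors) / len(neighbors)


def mcdgpa_sketch(adj, seeds, partition=connected_modules, score=neighbor_score):
    """Partition, rank candidates per disease-associated module, then fuse ranks."""
    seeds = set(seeds)
    fused = []
    for module in partition(adj):                 # step 1: module partition
        module_seeds = module & seeds
        if not module_seeds:
            continue                              # skip modules without seed genes
        candidates = sorted(module - seeds)
        ranked = sorted(                          # step 2: per-module prioritization
            candidates,
            key=lambda g: (-score(adj, module, module_seeds, g), g),
        )
        for position, gene in enumerate(ranked):
            # step 3 (assumed fusion rule): compare genes by relative rank
            fused.append((position / len(ranked), gene))
    return [gene for _, gene in sorted(fused)]


# Toy undirected network: two modules containing a seed, one without.
toy_network = {
    "S1": {"a", "b"}, "a": {"S1", "b"}, "b": {"S1", "a", "c"}, "c": {"b"},
    "S2": {"x"}, "x": {"S2", "y"}, "y": {"x"},
    "p": {"q"}, "q": {"p"},
}
global_ranking = mcdgpa_sketch(toy_network, seeds={"S1", "S2"})
print(global_ranking)
```

Because `partition` and `score` are passed in as arguments, other partition algorithms or prioritization methods can be swapped in without changing the fusion step, which mirrors the abstract's description of MCDGPA as a general framework.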