Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences and Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200031 Shanghai, China.
Hum Mol Genet. 2012 Apr 1;21(7):1611-24. doi: 10.1093/hmg/ddr599. Epub 2011 Dec 20.
Traditionally, genetic disorders have been classified as either Mendelian diseases or complex diseases. This nosology has greatly benefited genetic counseling and the development of gene mapping strategies. However, based on two well-established databases, we found that 54% (524 of 968) of Mendelian disease genes were also involved in complex diseases, and such genes have not been systematically analyzed. Here, we classified human genes into five categories: Mendelian and complex disease (MC) genes, Mendelian but not complex disease (MNC) genes, complex but not Mendelian disease (CNM) genes, essential genes and OTHER genes. First, we found that, on average, MC genes were associated with more diseases and phenotypes and were involved in more complex protein-protein interaction networks than MNC or CNM genes. Secondly, MC genes encoded the longest proteins and had the highest transcript count among all gene categories. Notably, the tissue specificity of MC genes was much higher than that of any other gene category (P < 7.5 × 10^-5), although their expression level was similar to that of essential genes. Thirdly, several lines of evidence indicated that MC genes have been subjected to both purifying and positive selection. Interestingly, the functions of some human disease genes may differ from those of their orthologs in non-primate mammals, since these genes were even less conserved than OTHER genes. The significant over-representation of copy number variations (CNVs) in CNM genes suggested an important role for CNVs in complex diseases. In brief, our study not only revealed the characteristics of MC genes, but also provided new insights into the other four gene categories.
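The three disease-gene categories described above amount to set operations on two gene lists, which can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the gene identifiers are hypothetical placeholders, and the abstract does not say how essential and OTHER genes were separated from the disease categories, so that step is left out. Only the counts 524 and 968 come from the abstract.

```python
def classify_disease_genes(mendelian, complex_):
    """Split disease genes into MC, MNC and CNM sets using set algebra."""
    mendelian, complex_ = set(mendelian), set(complex_)
    return {
        "MC": mendelian & complex_,   # Mendelian and complex
        "MNC": mendelian - complex_,  # Mendelian but not complex
        "CNM": complex_ - mendelian,  # complex but not Mendelian
    }


def overlap_fraction(n_overlap, n_total):
    """Fraction of genes in one list that also appear in the other."""
    return n_overlap / n_total


# Placeholder identifiers for illustration only, not real annotations.
mendelian_genes = {"GENE_A", "GENE_B", "GENE_C"}
complex_genes = {"GENE_B", "GENE_C", "GENE_D"}

categories = classify_disease_genes(mendelian_genes, complex_genes)
for name, genes in categories.items():
    print(name, sorted(genes))

# Counts reported in the abstract: 524 of 968 Mendelian disease genes.
print(f"{overlap_fraction(524, 968):.0%}")  # prints 54%
```

With real data, the two input sets would be the gene lists taken from the Mendelian and complex disease databases, after mapping both to a common gene identifier.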