Kondrashov Fyodor A, Ogurtsov Aleksey Y, Kondrashov Alexey S
National Center for Biotechnology Information, National Institutes of Health, 38a Center Drive, 6S602, Bethesda, MD 20892, USA.
Nucleic Acids Res. 2004 Mar 12;32(5):1731-7. doi: 10.1093/nar/gkh330. Print 2004.
Only a fraction of eukaryotic genes affect the phenotype drastically. We compared 18 parameters in 1273 human morbid genes, known to cause diseases, and in the remaining 16 580 unambiguous human genes. Morbid genes evolve more slowly, have wider phylogenetic distributions, are more similar to essential genes of Drosophila melanogaster, code for longer proteins containing more alanine and glycine and less histidine, lysine and methionine, possess more and longer introns with more accurate splicing signals, and are expressed at higher levels and more broadly. These differences make it possible to classify as non-morbid 34% of human genes with unknown morbidity, while only 5% of known morbid genes are incorrectly classified as non-morbid. This classification can help to identify disease-causing genes among multiple candidates.
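The trade-off quoted above (34% of genes of unknown morbidity called non-morbid, at a 5% error rate on known morbid genes) can be illustrated with a simple score cutoff. The sketch below is not the authors' classifier, which the abstract does not describe. It assumes a hypothetical per-gene "morbidity score" in which higher values look more morbid, and all names and the synthetic data are illustrative only.

```python
import random


def threshold_at_false_negative_rate(morbid_scores, fnr=0.05):
    """Pick a cutoff so that at most `fnr` of known morbid genes score strictly below it.

    Assumption: a higher score means the gene looks more like a morbid gene.
    """
    ranked = sorted(morbid_scores)
    allowed = int(fnr * len(ranked))  # morbid genes we tolerate misclassifying
    return ranked[allowed]


def fraction_below(scores, cutoff):
    """Fraction of genes whose score falls strictly below the cutoff (called non-morbid)."""
    return sum(s < cutoff for s in scores) / len(scores)


if __name__ == "__main__":
    # Synthetic scores for illustration only; these are not data from the study.
    rng = random.Random(0)
    known_morbid = [rng.gauss(1.0, 1.0) for _ in range(1273)]
    unknown = [rng.gauss(0.0, 1.0) for _ in range(16580)]

    cutoff = threshold_at_false_negative_rate(known_morbid, fnr=0.05)
    print("cutoff:", cutoff)
    print("known morbid called non-morbid:", fraction_below(known_morbid, cutoff))
    print("unknown called non-morbid:", fraction_below(unknown, cutoff))
```

Fixing the error rate on known morbid genes first and then reading off the fraction of unknown genes below the cutoff mirrors how the two percentages in the abstract relate to each other. The actual values depend entirely on how well the chosen score separates the two groups.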