De Ferrari Luna, Aitken Stuart
School of Informatics, the University of Edinburgh, Edinburgh EH8 9LE, UK.
BMC Genomics. 2006 Oct 30;7:277. doi: 10.1186/1471-2164-7-277.
Traditionally, housekeeping and tissue-specific genes have been classified by directly assaying mRNA presence across different tissues, but these experiments are costly and their results are difficult to compare and reproduce.
In this work, a Naive Bayes classifier based only on physical and functional characteristics of genes already available in databases, such as exon length and measures of chromatin compactness, achieved a 97% success rate in classifying human housekeeping genes (93% for mouse and 90% for fruit fly).
The newly obtained lists of housekeeping and tissue-specific genes are consistent with the functions and tissue expression patterns expected for the two classes. Overall, the classifier shows promise, and additional attributes might be included in the future to improve its discriminating power.
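To make the classification idea concrete, the following is a minimal sketch of a Naive Bayes classifier over two numeric gene attributes. It is not the authors' implementation: the Gaussian likelihoods are an assumption (the abstract does not say how continuous attributes were modelled), the function names `fit_gaussian_nb` and `predict` are hypothetical, and the training values are invented toy numbers that do not come from the paper or reflect real gene measurements.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def log_posterior(model, row):
    """Unnormalised log posterior per class, treating features as independent."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        scores[label] = score
    return scores


def predict(model, row):
    """Return the class with the highest log posterior."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)


# Invented toy data (assumption, not from the paper):
# each row is [exon length, chromatin compactness score].
train_rows = [
    [120.0, 0.20], [135.0, 0.25], [110.0, 0.15], [140.0, 0.30],
    [220.0, 0.70], [240.0, 0.80], [210.0, 0.65], [260.0, 0.75],
]
train_labels = ["housekeeping"] * 4 + ["tissue_specific"] * 4

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, [125.0, 0.22]))
```

The "naive" part is the independence assumption: each attribute contributes its own log-likelihood term, so adding a further attribute, as the authors suggest for future work, only means adding another column to the rows.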