van Driel Marc A, Bruggeman Jorn, Vriend Gert, Brunner Han G, Leunissen Jack A M
Centre for Molecular and Biomolecular Informatics, Radboud University Nijmegen, Toernooiveld 1, 6525ED Nijmegen, the Netherlands.
Eur J Hum Genet. 2006 May;14(5):535-42. doi: 10.1038/sj.ejhg.5201585.
A number of large-scale efforts are underway to define the relationships between genes and proteins in various species. However, few attempts have been made to systematically classify all such relationships at the phenotype level, and it is unknown whether such a phenotype map would carry biologically meaningful information. We have used text mining to classify over 5000 human phenotypes contained in the Online Mendelian Inheritance in Man database. We find that similarity between phenotypes reflects biological modules of interacting, functionally related genes. These similarities are positively correlated with a number of measures of gene function, including relatedness at the level of protein sequence, protein motifs, functional annotation, and direct protein-protein interaction. Phenotype grouping thus reflects the modular nature of human disease genetics, and phenotype mapping may be used to predict candidate genes for diseases as well as functional relations between genes and proteins. Such predictions will improve further if a unified system of phenotype descriptors is developed. The phenotype similarity data are accessible through a web interface at http://www.cmbi.ru.nl/MimMiner/.
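The abstract does not spell out how phenotype similarity is computed from text. As a rough illustration only, the sketch below uses one common text-mining approach: each phenotype description becomes a vector of word counts, and two phenotypes are compared by the cosine of the angle between their vectors. This is an assumption made for illustration, not necessarily the authors' method; the function names and the three toy descriptions are hypothetical and are not taken from the Online Mendelian Inheritance in Man database.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn a free-text description into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    shared_terms = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy descriptions, invented for this sketch.
phenotypes = {
    "A": "short stature with skeletal dysplasia",
    "B": "skeletal dysplasia and short limbs",
    "C": "progressive hearing loss",
}

vectors = {name: term_vector(text) for name, text in phenotypes.items()}
sim_ab = cosine_similarity(vectors["A"], vectors["B"])
sim_ac = cosine_similarity(vectors["A"], vectors["C"])

print(f"A vs B: {sim_ab:.2f}")
print(f"A vs C: {sim_ac:.2f}")
```

In this toy setting, descriptions A and B share several terms and so score higher than A and C, which share none. A realistic pipeline would likely need a controlled vocabulary and term weighting, which is consistent with the abstract's remark that a unified system of phenotype descriptors would improve predictions.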