Max Planck Institute for Informatics, Department of Computational Biology and Applied Algorithmics, Saarbrücken, Germany.
Bioinformatics. 2010 Sep 15;26(18):i561-7. doi: 10.1093/bioinformatics/btq384.
Many hereditary human diseases are polygenic, resulting from sequence alterations in multiple genes. Genomic linkage and association studies are commonly performed to identify disease-related genes. Such studies often yield lists of up to several hundred candidate genes, which have to be prioritized and validated further. Recent studies have found that genes involved in phenotypically similar diseases are often functionally related at the molecular level.
Here, we introduce MedSim, a novel approach for ranking candidate genes for a particular disease based on functional comparisons involving the Gene Ontology. MedSim uses functional annotations of known disease genes to assess both the similarity of diseases and the disease relevance of candidate genes. We benchmarked our approach with genes known to be involved in 99 diseases taken from the OMIM database. Using artificial quantitative trait loci, MedSim achieved excellent performance, with an area under the ROC curve of up to 0.90 and a sensitivity of over 70% at 90% specificity when classifying gene products according to their disease relatedness. This performance is comparable to, or even better than, that of related methods in the field, even though MedSim uses less information, which is therefore easier to obtain.
MedSim is offered as part of our FunSimMat web service (http://www.funsimmat.de).
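For readers unfamiliar with the two benchmark figures quoted above (area under the ROC curve, and sensitivity at a fixed specificity), the following is a minimal sketch of how such numbers can be computed from a list of candidate-gene scores and known disease labels. It is not MedSim's code: the function names, the toy scores, and the labels are all hypothetical and chosen only for illustration.

```python
def split_by_label(scores, labels):
    """Separate scores of known disease genes (label 1) from the others (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher;
    ties count as half a win."""
    pos, neg = split_by_label(scores, labels)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(scores, labels, specificity=0.90):
    """Best sensitivity over all score thresholds whose specificity
    is at least the requested value. A gene is predicted to be
    disease-related when its score is >= the threshold."""
    pos, neg = split_by_label(scores, labels)
    best = 0.0
    for threshold in sorted(set(scores)):
        spec = sum(1 for n in neg if n < threshold) / len(neg)
        sens = sum(1 for p in pos if p >= threshold) / len(pos)
        if spec >= specificity:
            best = max(best, sens)
    return best


# Hypothetical toy data: higher score = more disease-related.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
sens90 = sensitivity_at_specificity(toy_scores, toy_labels, 0.90)
```

The pairwise formulation of the AUC is quadratic in the number of genes, which is acceptable for candidate lists of a few hundred genes; a rank-based computation would be the usual choice for larger inputs.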