A similarity-based method for genome-wide prediction of disease-relevant human genes.

Author information

Freudenberg J, Propping P

Affiliation

Institute of Human Genetics, Bonn University Hospital, Germany.

Publication information

Bioinformatics. 2002;18 Suppl 2:S110-5. doi: 10.1093/bioinformatics/18.suppl_2.s110.

PMID:12385992

Abstract

MOTIVATION

A method for predicting disease-relevant human genes from the phenotypic appearance of a query disease is presented. Diseases of known genetic origin are clustered according to their phenotypic similarity. Each cluster entry consists of a disease and its underlying disease gene. Potential disease genes from the human genome are then scored by their functional similarity to the known disease genes in those clusters that are phenotypically similar to the query disease.
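The scoring idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measures, the clustering procedure, or how scores are combined. The sketch assumes that phenotypes and gene functions are plain sets of terms, uses Jaccard overlap for both similarities, replaces clustering with a simple phenotype-similarity threshold, and takes the best functional match as the score. All names (`jaccard`, `score_candidates`, `min_phenotype_sim`) and all data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; a stand-in for the unspecified similarity measures."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def score_candidates(
    query_phenotype: Set[str],
    known: List[dict],
    candidates: Dict[str, Set[str]],
    min_phenotype_sim: float = 0.3,  # assumed threshold, not from the paper
) -> List[Tuple[str, float]]:
    """Rank candidate genes for a query disease, best score first."""
    # Step 1: keep known disease/gene pairs whose phenotype resembles the query
    # (a threshold is used here in place of the paper's phenotype clustering).
    similar = [
        k for k in known
        if jaccard(query_phenotype, k["phenotype"]) >= min_phenotype_sim
    ]
    # Step 2: score each candidate by its best functional match to those genes.
    scores = {
        gene: max((jaccard(func, k["function"]) for k in similar), default=0.0)
        for gene, func in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for illustration only.
known = [
    {"disease": "D1", "gene": "G1",
     "phenotype": {"seizures", "ataxia"},
     "function": {"ion transport", "membrane"}},
    {"disease": "D2", "gene": "G2",
     "phenotype": {"anemia"},
     "function": {"oxygen binding"}},
]
candidates = {
    "C1": {"ion transport", "membrane"},
    "C2": {"oxygen binding"},
    "C3": {"dna repair"},
}
ranked = score_candidates({"seizures", "ataxia", "tremor"}, known, candidates)
print(ranked[0][0])  # highest-scoring candidate gene
```

In this toy case only D1 is phenotypically close to the query, so the candidate sharing the functional annotation of D1's gene ranks first. Taking the maximum rather than an average is one possible design choice; it favors candidates that closely resemble any single known disease gene.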

RESULTS

To assess the approach, a leave-one-out cross-validation of 878 diseases from the OMIM database is performed, using 10672 candidate genes from the human genome. Depending on the applied parameters, in roughly one-third of cases the true solution is contained within the top-scoring 3% of predictions, and in two-thirds of cases it is contained within the top-scoring 15% of predictions. The prediction results can be used either to identify target genes when searching for a mutation in monogenic diseases, or to select loci for genotyping experiments in genetically complex diseases.
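For scale, the top 3% and 15% of 10672 candidates correspond to roughly 320 and 1600 genes. The evaluation can be sketched as follows, under stated assumptions: each disease is held out in turn, a ranking function (here a hypothetical placeholder, `rank_fn`) orders the candidates using only the remaining diseases, and the position of the true gene is recorded as a fraction of the list. The function names and the toy ranking are illustrative and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def rank_percentile(ranked_genes: Sequence[str], true_gene: str) -> float:
    """Position of the true gene as a fraction of the list (smaller is better)."""
    return (ranked_genes.index(true_gene) + 1) / len(ranked_genes)


def fraction_within_top(percentiles: Sequence[float], cutoff: float) -> float:
    """Share of held-out cases whose true gene falls within the top `cutoff`."""
    return sum(p <= cutoff for p in percentiles) / len(percentiles)


def leave_one_out(
    entries: List[dict],
    rank_fn: Callable[[dict, List[dict]], Sequence[str]],
) -> List[float]:
    """Hold out each disease in turn and record where its true gene is ranked."""
    percentiles = []
    for i, held_out in enumerate(entries):
        training = entries[:i] + entries[i + 1:]  # all other diseases
        ranked = rank_fn(held_out, training)
        percentiles.append(rank_percentile(ranked, held_out["gene"]))
    return percentiles


# Toy setup: a fixed alphabetical ranking stands in for a real predictor.
all_genes = ["A", "B", "C", "D"]
entries = [{"disease": "d1", "gene": "A"},
           {"disease": "d2", "gene": "C"},
           {"disease": "d3", "gene": "D"}]


def toy_rank(held_out: dict, training: List[dict]) -> List[str]:
    return sorted(all_genes)  # ignores its inputs; illustration only


percentiles = leave_one_out(entries, toy_rank)
top_half = fraction_within_top(percentiles, 0.5)
print(percentiles, top_half)
```

With a real predictor in place of `toy_rank`, calling `fraction_within_top(percentiles, 0.03)` and `fraction_within_top(percentiles, 0.15)` would give the two kinds of figures reported above.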
