School of Medicine, Deakin University, Geelong, VIC 3217, Australia.
BMC Bioinformatics. 2013 Aug 16;14:249. doi: 10.1186/1471-2105-14-249.
Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required.
Gentrepid is a web resource which predicts and prioritizes candidate disease genes for both Mendelian and complex diseases. The system can take input from linkage analysis of single genetic intervals or multiple marker loci from genome-wide association studies. The underlying database of the Gentrepid tool sources data from numerous gene and protein resources, taking advantage of the wealth of biological information available. Using known disease gene information from OMIM, the system predicts and prioritizes disease gene candidates that participate in the same protein pathways or share similar protein domains. Alternatively, using an ab initio approach, the system can detect enrichment of these protein annotations without prior knowledge of the phenotype.
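The two modes described above can be illustrated with a minimal sketch. This is not Gentrepid's implementation: the function names, the dictionary-based annotation store, and the scoring rules (a count of shared annotations, and a minimum number of loci) are all assumptions made for illustration.

```python
from collections import defaultdict


def seeded_candidates(interval_genes, known_disease_genes, annotations):
    """Rank interval genes by annotations shared with known disease genes.

    Hypothetical helper: `annotations` maps a gene name to a set of
    pathway/domain labels. Genes sharing no annotation are dropped.
    """
    seed_terms = set()
    for gene in known_disease_genes:
        seed_terms |= annotations.get(gene, set())

    scored = []
    for gene in interval_genes:
        shared = annotations.get(gene, set()) & seed_terms
        if shared:
            scored.append((gene, sorted(shared)))
    # Most shared annotations first; ties broken alphabetically by gene name.
    scored.sort(key=lambda item: (-len(item[1]), item[0]))
    return scored


def ab_initio_terms(loci, annotations, min_loci=2):
    """Find annotations that recur across loci, with no known disease genes.

    Hypothetical helper: `loci` maps a locus id to the genes it contains.
    A simple count threshold stands in for a real enrichment statistic.
    """
    loci_per_term = defaultdict(set)
    for locus_id, genes in loci.items():
        for gene in genes:
            for term in annotations.get(gene, set()):
                loci_per_term[term].add(locus_id)
    return {
        term: sorted(ids)
        for term, ids in loci_per_term.items()
        if len(ids) >= min_loci
    }
```

In the seeded mode the known disease genes would come from a resource such as OMIM; in the ab initio mode only the loci themselves are needed, which is what makes it usable without prior knowledge of the phenotype.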
The system aims to integrate the wealth of protein information currently available with known and novel phenotype/genotype information to reveal the biological mechanisms underpinning disease. We have updated the system to facilitate analysis of GWAS data and the study of complex diseases. As an example, we apply the system to hypertension GWAS data from the ICBP. One notable prediction is a second ZIP transporter, in addition to the one identified by the ICBP analysis. The webserver URL is https://www.gentrepid.org/.