

Association of genes to genetically inherited diseases using data mining.

Authors

Perez-Iratxeta Carolina, Bork Peer, Andrade Miguel A

Affiliation

European Molecular Biology Laboratory, Meyerhofstr.1, Heidelberg 69012, Germany.

Publication details

Nat Genet. 2002 Jul;31(3):316-9. doi: 10.1038/ng895. Epub 2002 May 13.

Abstract

Although approximately one-quarter of the roughly 4,000 genetically inherited diseases currently recorded in respective databases (LocusLink, OMIM) are already linked to a region of the human genome, about 450 have no known associated gene. Finding disease-related genes requires laborious examination of hundreds of possible candidate genes (sometimes, these are not even annotated; see, for example, refs 3,4). The public availability of the human genome draft sequence has fostered new strategies to map molecular functional features of gene products to complex phenotypic descriptions, such as those of genetically inherited diseases. Owing to recent progress in the systematic annotation of genes using controlled vocabularies, we have developed a scoring system for the possible functional relationships of human genes to 455 genetically inherited diseases that have been mapped to chromosomal regions without assignment of a particular gene. In a benchmark of the system with 100 known disease-associated genes, the disease-associated gene was among the 8 best-scoring genes with a 25% chance, and among the best 30 genes with a 50% chance, showing that there is a relationship between the score of a gene and its likelihood of being associated with a particular disease. The scoring also indicates that for some diseases, the chance of identifying the underlying gene is higher.
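
The benchmark figures above (the true gene among the 8 best-scoring genes in 25% of cases, among the best 30 in 50%) are top-k hit rates. As a minimal sketch of how such a statistic could be computed, assuming each benchmark case has already been reduced to the rank of its known disease gene among the candidates in the mapped region, consider the following. The function name `top_k_hit_rate` and the example ranks are hypothetical illustrations; they are not the authors' code or data, and the scoring step itself is not reproduced here.

```python
from typing import Sequence


def top_k_hit_rate(ranks: Sequence[int], k: int) -> float:
    """Return the fraction of benchmark cases whose known disease gene
    is ranked within the top k candidates (rank 1 = best score)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    if k < 1:
        raise ValueError("k must be at least 1")
    hits = sum(1 for rank in ranks if rank <= k)
    return hits / len(ranks)


# Hypothetical ranks of the true gene in eight benchmark cases
# (made-up values for illustration, not taken from the paper).
example_ranks = [1, 5, 8, 12, 30, 31, 75, 120]

rate_top_8 = top_k_hit_rate(example_ranks, 8)    # 3 of 8 cases
rate_top_30 = top_k_hit_rate(example_ranks, 30)  # 5 of 8 cases
```

Reporting the hit rate at several cutoffs, as the abstract does for 8 and 30, shows how quickly the chance of recovering the true gene grows as more top-scoring candidates are examined.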

