Butte Atul J, Chen Rong
Stanford Medical Informatics, Department of Medicine and Pediatrics, Stanford University School of Medicine, Stanford, California, USA.
AMIA Annu Symp Proc. 2006;2006:106-10.
The amount of gene expression data in international repositories has grown exponentially. An important first step in translating the results of genomic experiments into medicine is to relate these experiments to the human diseases they studied. Unfortunately, repositories for expression data store the crucial annotative details only as free text, making it intractable to link them to human disease manually. In this study, we sought to find experiments in NCBI GEO that are related to human diseases by making use of annotations that link each experiment to the PubMed identifier of the publication in which it was reported. In this manner, we find that 35% of PubMed-associated genomic experiments can be related to a human disease, and that publicly available data from these genomic experiments can already be related to over 270 human diseases and conditions. This represents an important first step in bridging the world of nucleotides, transcripts and expression with the afflictions of us all.