Van Vooren Steven, Thienpont Bernard, Menten Björn, Speleman Frank, De Moor Bart, Vermeesch Joris, Moreau Yves
Department of Electrotechnical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Heverlee, Belgium.
Nucleic Acids Res. 2007;35(8):2533-43. doi: 10.1093/nar/gkm054. Epub 2007 Apr 1.
Biomedical literature provides a rich but unstructured source of associations between chromosomal regions and biomedical concepts. By mining MEDLINE abstracts, we annotate the human genome at the level of cytogenetic bands. Our method creates a set of chromosomal aberration maps that associate cytogenetic bands with biomedical concepts from a variety of controlled vocabularies, including disease, dysmorphology, anatomy, development and Gene Ontology branches. The association between a band (e.g. 4p16.3) and a concept (e.g. microcephaly) is assessed by the statistical overrepresentation of this concept in the abstracts relating to this band. Our method is validated using existing genome annotation resources and known chromosomal aberration maps, and is further illustrated through a case study on heart disease. Our chromosomal aberration maps provide diagnostic support to clinical geneticists, help cytogeneticists interpret and report cytogenetic findings, and support researchers interested in human gene function. The method is available as a web application, aBandApart, at http://www.esat.kuleuven.be/abandapart/.
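The overrepresentation idea described in the abstract can be illustrated with a small sketch. The abstract does not say which statistical test is used, so the one-sided hypergeometric test below is an assumption chosen for illustration, not necessarily the authors' method. The function name, parameter names and abstract counts are all hypothetical.

```python
from math import comb


def overrepresentation_pvalue(total_abstracts, concept_abstracts,
                              band_abstracts, overlap):
    """One-sided hypergeometric p-value (assumed test, not from the paper).

    Returns the probability of seeing at least `overlap` abstracts that
    mention the concept among the `band_abstracts` abstracts linked to a
    band, if those abstracts were drawn at random from a corpus of
    `total_abstracts` abstracts, `concept_abstracts` of which mention
    the concept.
    """
    max_overlap = min(concept_abstracts, band_abstracts)
    # Number of ways to choose the band's abstracts from the whole corpus.
    denominator = comb(total_abstracts, band_abstracts)
    # Sum the upper tail: k concept abstracts and (band_abstracts - k) others.
    tail = sum(
        comb(concept_abstracts, k)
        * comb(total_abstracts - concept_abstracts, band_abstracts - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical counts: a corpus of 10,000 abstracts, 50 mentioning the
# concept, 200 linked to the band, and 12 that do both.
p_value = overrepresentation_pvalue(10_000, 50, 200, 12)
print(f"p = {p_value:.3g}")
```

A small p-value would indicate that the concept appears in the band's abstracts more often than chance alone would suggest. Because many band-concept pairs are tested, any real use of such a test would also need a multiple-testing correction.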