Rindflesch Thomas C, Blake Catherine L, Fiszman Marcelo, Kilicoglu Halil, Rosemblat Graciela, Schneider Jodi, Zeiss Caroline J
Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland. School of Information Sciences, University of Illinois, Urbana-Champaign; Center for Informatics in Science and Scholarship. Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland. Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland. Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland. School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, Illinois. Yale University School of Medicine, New Haven, Connecticut.
ILAR J. 2017 Jul 1;58(1):80-89. doi: 10.1093/ilar/ilx004.
Informatics methodologies exploit computer-assisted techniques to help biomedical researchers manage large amounts of information. In this paper, we focus on the biomedical research literature (MEDLINE). We first provide an overview of some text mining techniques that offer assistance in research by identifying biomedical entities (e.g., genes, substances, and diseases) and relations between them in text. We then discuss Semantic MEDLINE, an application that integrates PubMed document retrieval, concept and relation identification, and visualization, thus enabling a user to explore concepts and relations from within a set of retrieved citations. Semantic MEDLINE provides a roadmap through content and helps users discern patterns in large numbers of retrieved citations. We illustrate its use with an informatics method we call "discovery browsing," which provides a principled way of navigating through selected aspects of some biomedical research area. The method supports an iterative process that accommodates learning and hypothesis formation in which a user is provided with high-level connections before delving into details. As a use case, we examine current developments in basic research on mechanisms of Alzheimer's disease. Out of the nearly 90 000 citations returned by the PubMed query "Alzheimer's disease," discovery browsing led us to 73 citations on sortilin and that disorder. We provide a synopsis of the basic research reported in 15 of these. There is widespread consensus among researchers working with a range of animal models and human cells that increased sortilin expression and decreased receptor expression are associated with amyloid beta and/or amyloid precursor protein.