National Center for Biotechnology Information.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa142.
To obtain key information for personalized medicine and cancer research, clinicians and researchers in the biomedical field need to search the biomedical literature for genomic variant information now more than ever before. Because genomic variants are written in many different forms, however, it is difficult to locate the right information with a general-purpose literature search system. To address this difficulty, researchers have proposed various solutions based on automated literature-mining techniques. To date, however, no study has summarized and compared the existing tools for genomic variant literature mining in terms of how easily they let users search the literature for genomic variant information.
In this article, we systematically compare currently available genomic variant recognition and normalization tools, as well as the literature search engines that have adopted these literature-mining techniques. First, we explain the problems caused by the use of non-standard formats for genomic variants in the PubMed literature, using examples from the literature, and show how prevalent the problem is. Second, we review and systematically compare literature-mining tools that address the problem by recognizing and normalizing the various forms of genomic variants in the literature. Third, we present and compare existing literature search engines designed for genomic variant search using these literature-mining techniques. We expect this work to be helpful for researchers who seek information about genomic variants in the literature, for developers who integrate genomic variant information from the literature, and beyond.
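To make the normalization problem concrete, the following minimal Python sketch maps several written forms of the same protein-level substitution onto a single HGVS-style string. It is an illustration only and is not the method of any tool reviewed in the article: the function name `normalize_protein_substitution` and the regular expression are assumptions made for this example, and it handles only single amino acid substitutions, not DNA-level variants, deletions, insertions, or rsIDs.

```python
import re
from typing import Optional

# One-letter to three-letter amino acid codes ("*" denotes a stop codon).
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}
VALID_THREE = set(AA3.values())

# Hypothetical pattern: optional "p." prefix, residue, position, residue.
# Each residue is either a three-letter or a one-letter code.
PATTERN = re.compile(r"(?:p\.)?([A-Za-z]{3}|[A-Z*])(\d+)([A-Za-z]{3}|[A-Z*])")


def _to_three(token: str) -> Optional[str]:
    """Return the three-letter code for a residue token, or None if invalid."""
    if len(token) == 1:
        return AA3.get(token)
    candidate = token.capitalize()
    return candidate if candidate in VALID_THREE else None


def normalize_protein_substitution(mention: str) -> Optional[str]:
    """Map a substitution mention to the form 'p.Val600Glu', or return None."""
    match = PATTERN.fullmatch(mention.strip())
    if match is None:
        return None
    ref = _to_three(match.group(1))
    alt = _to_three(match.group(3))
    if ref is None or alt is None:
        return None
    return f"p.{ref}{match.group(2)}{alt}"


if __name__ == "__main__":
    for text in ["V600E", "Val600Glu", "p.V600E", "p.Val600Glu", "BRAF"]:
        print(text, "->", normalize_protein_substitution(text))
```

Under these assumptions, the first four mentions all resolve to the same string, so a search index keyed on the normalized form could retrieve documents regardless of how the variant was written. Variant mentions in real articles are far more irregular than this pattern allows, which is why dedicated recognition and normalization tools exist.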