A hybrid approach for automated mutation annotation of the extended human mutation landscape in scientific literature.

Author information

Yepes Antonio Jimeno, MacKinlay Andrew, Gunn Natalie, Schieber Christine, Faux Noel, Downton Matthew, Goudey Benjamin, Martin Richard L

Affiliations

IBM Research, Southbank, VIC, Australia.

IBM Watson Health, Cambridge, MA, USA.

Publication information

AMIA Annu Symp Proc. 2018 Dec 5;2018:616-623. eCollection 2018.

Abstract

As the cost of DNA sequencing continues to fall, an increasing amount of information on human genetic variation is being produced that could help advance precision medicine. However, information about such mutations is typically first made available in the scientific literature, and only later manually curated into more standardized genomic databases. This curation process is expensive and time-consuming, and many variants are never fully curated, if they are curated at all. Detecting mutations in the literature is the first key step towards automating this process. However, most current methods have focused on identifying mutations that follow existing nomenclatures. In this work, we show that a large number of mutations are missed by this standard approach. Furthermore, we implement the first mutation annotator to cover an extended mutation landscape, and we show that its F1 performance is comparable to that of human annotation (F1 78.29 for manual annotation vs F1 79.56 for automatic annotation).
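To make the contrast in the abstract concrete, the following is a minimal Python sketch of nomenclature-only detection. It is not the annotator described in the paper: the patterns, the `PATTERNS` list, and the `find_nomenclature_mentions` helper are hypothetical and cover only a tiny fraction of real nomenclature. The example sentences are invented for illustration.

```python
import re

# Hypothetical, heavily simplified patterns (an assumption for illustration);
# real mutation nomenclature is far richer than these three forms.
AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|"
       "Phe|Pro|Ser|Thr|Trp|Tyr|Val")
AA1 = "ACDEFGHIKLMNPQRSTVWY"

PATTERNS = [
    re.compile(rf"\bp\.(?:{AA3})\d+(?:{AA3})\b"),  # e.g. p.Val600Glu
    re.compile(rf"\b[{AA1}]\d+[{AA1}]\b"),         # e.g. V600E
    re.compile(r"\bc\.\d+[ACGT]>[ACGT]\b"),        # e.g. c.1799T>A
]


def find_nomenclature_mentions(text):
    """Return (start, end, matched_text) tuples for every pattern hit, sorted by position."""
    hits = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), match.group(0)))
    return sorted(hits)


# Invented example sentences: one uses nomenclature, one uses free text.
standard = "The BRAF V600E mutation (p.Val600Glu, c.1799T>A) was detected."
free_text = "A substitution of valine by glutamic acid at codon 600 was detected."

standard_hits = [h[2] for h in find_nomenclature_mentions(standard)]
free_text_hits = [h[2] for h in find_nomenclature_mentions(free_text)]
```

The second sentence describes the same kind of change in natural language and yields no hits, which is the sort of mention the abstract says nomenclature-based methods miss. Patterns this loose can also fire on look-alike tokens that are not mutations, so a sketch like this should not be treated as a usable detector.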
