A hybrid approach for automated mutation annotation of the extended human mutation landscape in scientific literature.

Authors

Yepes Antonio Jimeno, MacKinlay Andrew, Gunn Natalie, Schieber Christine, Faux Noel, Downton Matthew, Goudey Benjamin, Martin Richard L

Affiliations

IBM Research, Southbank, VIC, Australia.

IBM Watson Health, Cambridge, MA, USA.

Publication information

AMIA Annu Symp Proc. 2018 Dec 5;2018:616-623. eCollection 2018.

PMID: 30815103
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6371299/
Abstract

As the cost of DNA sequencing continues to fall, an increasing amount of information on human genetic variation is being produced that could help advance precision medicine. However, information about such mutations is typically first made available in the scientific literature and only later manually curated into more standardized genomic databases. This curation process is expensive and time-consuming, and many variants are never fully curated, if they are curated at all. Detecting mutations in the literature is the first key step towards automating this process. However, most current methods focus on identifying mutations that follow existing nomenclatures. In this work, we show that a large number of mutations are missed by this standard approach. Furthermore, we implement the first mutation annotator to cover an extended mutation landscape, and we show that its F1 performance is comparable to that of human annotation (F1 78.29 for manual annotation vs. F1 79.56 for automatic annotation).
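The gap the abstract describes can be illustrated with a small sketch. This is not the authors' annotator: the regular expressions, the example sentence, and the exact-span scoring are all assumptions chosen for illustration. It detects only mentions that follow a few standard-looking notations, then scores them against a hypothetical gold annotation that also contains a free-text mutation description.

```python
import re

# Hypothetical, simplified patterns for nomenclature-style mentions.
# They are illustrative only and far from a complete grammar.
STANDARD_PATTERNS = [
    re.compile(r"\bp\.[A-Z][a-z]{2}\d+[A-Z][a-z]{2}\b"),  # e.g. p.Val600Glu
    re.compile(r"\bc\.\d+[ACGT]>[ACGT]\b"),               # e.g. c.1799T>A
    re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]\d+[ACDEFGHIKLMNPQRSTVWY]\b"),  # e.g. V600E
    re.compile(r"\brs\d+\b"),                             # e.g. rs113488022
]


def find_standard_mentions(text):
    """Return the set of (start, end) spans matched by any pattern."""
    spans = set()
    for pattern in STANDARD_PATTERNS:
        for match in pattern.finditer(text):
            spans.add((match.start(), match.end()))
    return spans


def f1_score(predicted, gold):
    """Exact-span F1: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example sentence (an assumption, not taken from the paper's corpus).
text = ("The BRAF V600E variant (c.1799T>A) and a deletion of exon 19 "
        "in EGFR were reported.")


def span_of(phrase):
    """Locate the first occurrence of a phrase and return its span."""
    start = text.index(phrase)
    return (start, start + len(phrase))


# Hypothetical gold annotation: two standard mentions plus one free-text one.
gold = {span_of("V600E"), span_of("c.1799T>A"), span_of("deletion of exon 19")}
predicted = find_standard_mentions(text)
score = f1_score(predicted, gold)
print(f"predicted={sorted(predicted)} F1={score:.2f}")
```

In this toy case the patterns find both nomenclature-style mentions but miss the free-text "deletion of exon 19", so precision is perfect while recall drops. The scores reported in the abstract come from the authors' own corpus and evaluation protocol, which this sketch does not reproduce.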
