Identifying gene-specific variations in biomedical text.

Authors

Klinger Roman, Friedrich Christoph M, Mevissen Heinz Theodor, Fluck Juliane, Hofmann-Apitius Martin, Furlong Laura I, Sanz Ferran

Affiliation

Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany.

Publication information

J Bioinform Comput Biol. 2007 Dec;5(6):1277-96. doi: 10.1142/s0219720007003156.

DOI: 10.1142/s0219720007003156
PMID: 18172929

Abstract

The influence of genetic variations on diseases or cellular processes is the main focus of many investigations, and results of biomedical studies are often only accessible through scientific publications. Automatic extraction of this information requires recognition of the gene names and the accompanying allelic variant information. In a previous work, the OSIRIS system for the detection of allelic variation in text based on a query expansion approach was communicated. Challenges associated with this system are the relatively low recall for variation mentions and gene name recognition. To tackle this challenge, we integrate the ProMiner system developed for the recognition and normalization of gene and protein names with a conditional random field (CRF)-based recognition of variation terms in biomedical text. Following the newly developed normalization of variation entities, we can link textual entities to Single Nucleotide Polymorphism database (dbSNP) entries. The performance of this novel approach is evaluated, and improved results in comparison to state-of-the-art systems are reported.
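The normalization step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' method: the paper uses a CRF-based recognizer, whereas the example below swaps in a simple regular expression and handles only protein-level substitutions such as "Ala123Thr" or "A123T". The names `Variation`, `PATTERN`, and `find_variations` are hypothetical and chosen for illustration. The idea shown is that different surface forms are mapped to one canonical tuple, which could then serve as a key when looking up a database entry.

```python
import re
from typing import List, NamedTuple

# Standard three-letter to one-letter amino acid abbreviations.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


class Variation(NamedTuple):
    """Canonical form of a substitution (hypothetical, for illustration)."""
    wild_type: str  # one-letter code of the original residue
    position: int   # residue position in the protein sequence
    mutant: str     # one-letter code of the substituted residue


_AA3 = "|".join(AA3_TO_1)
_AA1 = "".join(sorted(AA3_TO_1.values()))

# Assumed pattern: three-letter form (Ala123Thr) or one-letter form (A123T).
# A regex stand-in for the CRF tagger described in the abstract.
PATTERN = re.compile(
    rf"\b(?:(?P<w3>{_AA3})(?P<p3>\d+)(?P<m3>{_AA3})"
    rf"|(?P<w1>[{_AA1}])(?P<p1>\d+)(?P<m1>[{_AA1}]))\b"
)


def find_variations(text: str) -> List[Variation]:
    """Return every substitution mention in text, in canonical form."""
    found = []
    for m in PATTERN.finditer(text):
        if m.group("w3"):
            found.append(Variation(AA3_TO_1[m.group("w3")],
                                   int(m.group("p3")),
                                   AA3_TO_1[m.group("m3")]))
        else:
            found.append(Variation(m.group("w1"),
                                   int(m.group("p1")),
                                   m.group("m1")))
    return found


if __name__ == "__main__":
    print(find_variations("The Ala123Thr variant and V600E were reported."))
```

Both spellings of the same substitution normalize to the same tuple, so "Ala123Thr" and "A123T" compare equal after normalization. A real system would also need to handle nucleotide-level mentions, rs identifiers, and free-text descriptions, which this sketch does not attempt.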

Similar articles

1
Identifying gene-specific variations in biomedical text.
J Bioinform Comput Biol. 2007 Dec;5(6):1277-96. doi: 10.1142/s0219720007003156.
2
OSIRISv1.2: a named entity recognition system for sequence variants of genes in biomedical literature.
BMC Bioinformatics. 2008 Feb 5;9:84. doi: 10.1186/1471-2105-9-84.
3
Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers.
BMC Bioinformatics. 2011;12 Suppl 4(Suppl 4):S4. doi: 10.1186/1471-2105-12-S4-S4. Epub 2011 Jul 5.
4
Automated curation of gene name normalization results using the Konstanz information miner.
J Biomed Inform. 2015 Feb;53:58-64. doi: 10.1016/j.jbi.2014.08.016. Epub 2014 Sep 10.
5
Playing biology's name game: identifying protein names in scientific text.
Pac Symp Biocomput. 2003:403-14.
6
SETH detects and normalizes genetic variants in text.
Bioinformatics. 2016 Sep 15;32(18):2883-5. doi: 10.1093/bioinformatics/btw234. Epub 2016 Jun 2.
7
Evaluation of techniques for increasing recall in a dictionary approach to gene and protein name identification.
J Biomed Inform. 2007 Jun;40(3):316-24. doi: 10.1016/j.jbi.2006.09.002. Epub 2006 Sep 24.
8
Building a protein name dictionary from full text: a machine learning term extraction approach.
BMC Bioinformatics. 2005 Apr 7;6:88. doi: 10.1186/1471-2105-6-88.
9
Comparison of character-level and part of speech features for name recognition in biomedical texts.
J Biomed Inform. 2004 Dec;37(6):423-35. doi: 10.1016/j.jbi.2004.08.008.
10
Text mining in livestock animal science: introducing the potential of text mining to animal sciences.
J Anim Sci. 2012 Oct;90(10):3666-76. doi: 10.2527/jas.2011-4841. Epub 2012 Jun 4.

Cited by

1
The SNPcurator: literature mining of enriched SNP-disease associations.
Database (Oxford). 2018 Jan 1;2018. doi: 10.1093/database/bay020.
2
tmVar 2.0: integrating genomic variant information from literature with dbSNP and ClinVar for precision medicine.
Bioinformatics. 2018 Jan 1;34(1):80-87. doi: 10.1093/bioinformatics/btx541.
3
Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers.
BMC Bioinformatics. 2011;12 Suppl 4(Suppl 4):S4. doi: 10.1186/1471-2105-12-S4-S4. Epub 2011 Jul 5.
4
Improved mutation tagging with gene identifiers applied to membrane protein stability prediction.
BMC Bioinformatics. 2009 Aug 27;10 Suppl 8(Suppl 8):S3. doi: 10.1186/1471-2105-10-S8-S3.
5
Identification of histone modifications in biomedical text for supporting epigenomic research.
BMC Bioinformatics. 2009 Jan 30;10 Suppl 1(Suppl 1):S28. doi: 10.1186/1471-2105-10-S1-S28.
6
Detection of IUPAC and IUPAC-like chemical names.
Bioinformatics. 2008 Jul 1;24(13):i268-76. doi: 10.1093/bioinformatics/btn181.