Automated extraction and semantic analysis of mutation impacts from the biomedical literature.

Affiliation

Semantic Software Lab, Department of Computer Science and Software Engineering, Concordia University, Montréal, Québec, Canada.

Publication information

BMC Genomics. 2012 Jun 18;13 Suppl 4(Suppl 4):S10. doi: 10.1186/1471-2164-13-S4-S10.

DOI: 10.1186/1471-2164-13-S4-S10
PMID: 22759648
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3395893/
Abstract

BACKGROUND

Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Access to mutational information and its impact on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast-growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges remain in grounding impacts to their respective mutations and in recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities.

RESULTS

We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguate genes and proteins. Our system then detects mutation series so that detected impacts can be grounded correctly using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects, and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1% and a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data, and end-user and developer documentation, is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner.
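To make the grounding task concrete, the following is a minimal sketch, not the heuristics used by the system described above. It assumes mutations are written in the single-letter form `W140F`, uses a tiny hypothetical impact lexicon, and grounds each impact word to the nearest mutation mention in the same sentence. The names `MUTATION_RE`, `IMPACT_RE`, `Grounding`, and `ground_impacts` are illustrative only.

```python
import re
from dataclasses import dataclass

# Single-letter amino acid codes; matches mentions such as "W140F" or "A55T".
# Assumption: only this notation is handled (no "Trp140Phe", no mutation series).
AA = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"\b([{AA}])(\d+)([{AA}])\b")

# Tiny hypothetical impact lexicon; a real system would need a far richer one.
IMPACT_RE = re.compile(
    r"\b(increas\w*|decreas\w*|reduc\w*|enhanc\w*|abolish\w*)\b",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Grounding:
    impact: str    # the impact word found in the text
    mutation: str  # the mutation mention it was attached to


def ground_impacts(sentence: str) -> list[Grounding]:
    """Attach each impact word to the closest mutation mention (by character offset)."""
    mutations = [(m.start(), m.group(0)) for m in MUTATION_RE.finditer(sentence)]
    if not mutations:
        return []  # nothing to ground against
    results = []
    for imp in IMPACT_RE.finditer(sentence):
        nearest = min(mutations, key=lambda mu: abs(mu[0] - imp.start()))
        results.append(Grounding(imp.group(0), nearest[1]))
    return results


if __name__ == "__main__":
    text = ("The W140F mutant showed decreased activity, "
            "whereas A55T increased thermostability.")
    for g in ground_impacts(text):
        print(f"{g.impact} -> {g.mutation}")
```

A proximity rule like this breaks down on sentences that list several mutations before a single shared impact statement, which is one reason the abstract stresses detecting mutation series before grounding.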

CONCLUSION

We present Open Mutation Miner (OMM), the first comprehensive, fully open-source approach to automatically extract mutation impacts and related information from the biomedical literature. We assessed the performance of our work on manually annotated corpora, and the results show the reliability of our approach. Representing the extracted information in a structured format facilitates knowledge management and aids in database curation and correction. Furthermore, access to the analysis results is provided through multiple interfaces, including web services for automated data integration and desktop-based solutions for end-user interaction.
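For readers unfamiliar with how precision and recall are computed against a manually annotated corpus, here is a small sketch. It assumes exact matching between predicted and gold annotations, which the abstract does not specify, and the annotation tuples and the function name `evaluate` are hypothetical; the numbers it produces are unrelated to the reported results.

```python
def evaluate(gold: set, predicted: set) -> dict[str, float]:
    """Compute precision, recall and F1 under exact-match comparison."""
    tp = len(gold & predicted)   # annotations found and correct
    fp = len(predicted - gold)   # annotations found but not in the gold standard
    fn = len(gold - predicted)   # gold annotations that were missed
    precision = tp / (tp + fp) if predicted else 0.0
    recall = tp / (tp + fn) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical (document, mutation, impact) annotations for illustration.
    gold = {
        ("doc1", "W140F", "decreased"),
        ("doc1", "A55T", "increased"),
        ("doc2", "K12E", "abolished"),
        ("doc2", "G7D", "reduced"),
    }
    predicted = {
        ("doc1", "W140F", "decreased"),
        ("doc1", "A55T", "increased"),
        ("doc2", "K12E", "abolished"),
        ("doc2", "G7D", "increased"),  # wrong impact: counts as FP and causes an FN
    }
    print(evaluate(gold, predicted))
```

Grounding accuracy, reported separately in the abstract, would be measured on a different question: given a detected impact, whether it was linked to the correct mutation.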
