Protein interaction sentence detection using multiple semantic kernels.

Authors

Polajnar Tamara, Damoulas Theodoros, Girolami Mark

Affiliation

School of Computing Science, University of Glasgow, Glasgow, UK.

Publication

J Biomed Semantics. 2011 May 14;2(1):1. doi: 10.1186/2041-1480-2-1.

DOI: 10.1186/2041-1480-2-1
PMID: 21569604
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3116455/
Abstract

BACKGROUND

Detection of sentences that describe protein-protein interactions (PPIs) in biomedical publications is a challenging and unresolved pattern recognition problem. Many state-of-the-art approaches for this task employ kernel classification methods, in particular support vector machines (SVMs). In this work we propose a novel data integration approach that utilises semantic kernels and a kernel classification method that is a probabilistic analogue to SVMs. Semantic kernels are created from statistical information gathered from large amounts of unlabelled text using lexical semantic models. Several semantic kernels are then fused into an overall composite classification space. In this initial study, we use simple features in order to examine whether the use of combinations of kernels constructed using word-based semantic models can improve PPI sentence detection.
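The idea of fusing several kernels into one composite classification space can be illustrated with a small sketch. It assumes the composite kernel is a convex (weighted-sum) combination of Gaussian base kernels, which is one common form of multiple kernel learning; the abstract does not state the exact composition rule. The function names, the toy feature vectors, and the fixed weights are hypothetical, and the weights are set by hand here, whereas the abstract describes a method that infers the most discriminative kernels automatically.

```python
import math


def gaussian_kernel(X, gamma=1.0):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def combine_kernels(kernels, weights):
    """Weighted sum of Gram matrices (assumed form of the composite kernel).

    Weights must be non-negative and sum to 1, so the result is again
    a valid (positive semi-definite) kernel.
    """
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three sentences in two feature spaces.
bag_of_words = [[1.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
semantic_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

K_bow = gaussian_kernel(bag_of_words, gamma=0.5)
K_sem = gaussian_kernel(semantic_vectors, gamma=2.0)

# Hand-picked weights for illustration only; a multiple kernel learning
# method would estimate these from labelled data.
K_composite = combine_kernels([K_bow, K_sem], [0.3, 0.7])
```

The resulting Gram matrix can be passed to any kernel classifier that accepts a precomputed kernel. Under a learned weighting, a large weight would indicate that the corresponding kernel is more discriminative for the task.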

RESULTS

We show that combinations of semantic kernels lead to statistically significant improvements in recognition rates and receiver operating characteristic (ROC) scores over the plain Gaussian kernel, when applied to a well-known labelled collection of abstracts. The proposed kernel composition method also allows us to automatically infer the most discriminative kernels.
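As a rough illustration of the ROC score mentioned above, the sketch below computes the area under the ROC curve as the fraction of positive/negative sentence pairs that a classifier ranks correctly. Reading "ROC score" as this area is an assumption, since the abstract does not define the term; the function name and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for a PPI sentence, 0 otherwise.
    scores: classifier outputs, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of PPI sentences from the rest.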

CONCLUSIONS

The results from this paper indicate that using semantic information from unlabelled text, and combinations of such information, can be valuable for classification of short texts such as PPI sentences. This study, however, is only a first step in evaluation of semantic kernels and probabilistic multiple kernel learning in the context of PPI detection. The method described herein is modular, and can be applied with a variety of feature types, kernels, and semantic models, in order to facilitate full extraction of interacting proteins.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/69ff82ab2838/2041-1480-2-1-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/1d46686c8e9d/2041-1480-2-1-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/fec8534634af/2041-1480-2-1-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/8f06c53d2605/2041-1480-2-1-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/cb46ea054f96/2041-1480-2-1-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/6f369e3f7b56/2041-1480-2-1-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fced/3116455/678da04b5208/2041-1480-2-1-7.jpg

Similar articles

1
Protein interaction sentence detection using multiple semantic kernels.
J Biomed Semantics. 2011 May 14;2(1):1. doi: 10.1186/2041-1480-2-1.
2
Exploiting graph kernels for high performance biomedical relation extraction.
J Biomed Semantics. 2018 Jan 30;9(1):7. doi: 10.1186/s13326-017-0168-3.
3
Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures.
PLoS One. 2014 Mar 12;9(3):e91898. doi: 10.1371/journal.pone.0091898. eCollection 2014.
4
Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature.
PLoS One. 2017 Nov 3;12(11):e0187379. doi: 10.1371/journal.pone.0187379. eCollection 2017.
5
Nonlinear Deep Kernel Learning for Image Annotation.
IEEE Trans Image Process. 2017 Apr;26(4):1820-1832. doi: 10.1109/TIP.2017.2666038. Epub 2017 Feb 8.
6
Leveraging syntactic and semantic graph kernels to extract pharmacokinetic drug drug interactions from biomedical literature.
BMC Syst Biol. 2016 Aug 26;10 Suppl 3(Suppl 3):67. doi: 10.1186/s12918-016-0311-2.
7
A comprehensive benchmark of kernel methods to extract protein-protein interactions from literature.
PLoS Comput Biol. 2010 Jul 1;6(7):e1000837. doi: 10.1371/journal.pcbi.1000837.
8
Context-dependent kernels for object classification.
IEEE Trans Pattern Anal Mach Intell. 2011 Apr;33(4):699-708. doi: 10.1109/TPAMI.2010.198.
9
Efficient classification for additive kernel SVMs.
IEEE Trans Pattern Anal Mach Intell. 2013 Jan;35(1):66-77. doi: 10.1109/TPAMI.2012.62.
10
Walk-weighted subsequence kernels for protein-protein interaction extraction.
BMC Bioinformatics. 2010 Feb 25;11:107. doi: 10.1186/1471-2105-11-107.

Cited by

1
Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts.
J Biomed Semantics. 2015 May 18;6:27. doi: 10.1186/s13326-015-0023-3. eCollection 2015.
2
Automatic extraction of biomolecular interactions: an empirical approach.
BMC Bioinformatics. 2013 Jul 24;14:234. doi: 10.1186/1471-2105-14-234.
3
Popular computational methods to assess multiprotein complexes derived from label-free affinity purification and mass spectrometry (AP-MS) experiments.
Mol Cell Proteomics. 2013 Jan;12(1):1-13. doi: 10.1074/mcp.R112.019554. Epub 2012 Oct 15.
4
Extraction of data deposition statements from the literature: a method for automatically tracking research results.
Bioinformatics. 2011 Dec 1;27(23):3306-12. doi: 10.1093/bioinformatics/btr573. Epub 2011 Oct 13.