Semi-automatic conversion of BioProp semantic annotation to PASBio annotation.

Authors

Tsai Richard Tzong-Han, Dai Hong-Jie, Huang Chi-Hsin, Hsu Wen-Lian

Affiliation

Department of Computer Science & Engineering, Yuan Ze University, Chung-Li, Taiwan, R.O.C.

Publication

BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S18. doi: 10.1186/1471-2105-9-S12-S18.

DOI:10.1186/1471-2105-9-S12-S18
PMID:19091017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2638158/

Abstract

BACKGROUND

Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, due to the lack of an annotated PASBio corpus, no publicly available machine-learning (ML) based SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter, and an additional semi-automatic rule generator.
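
The abstract does not specify the converter's rule format, so the following is only a minimal sketch of how a rule-based relabeling of a predicate-argument structure might look. The class names, the rule table, the example sentence fragments, and every role mapping are hypothetical and chosen for illustration; they are not taken from BioProp, PASBio, or BIOSMILE. The sketch also assumes that arguments without a matching rule are dropped, which may differ from the authors' converter.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Argument:
    role: str  # semantic role label, e.g. "Arg0" or "ArgM-LOC"
    text: str  # the phrase filling that role


@dataclass(frozen=True)
class PAS:
    predicate: str
    arguments: Tuple[Argument, ...]


# Hypothetical rule table: (predicate, source role) -> target role.
# These mappings are invented for illustration only.
RuleTable = Dict[Tuple[str, str], str]

EXAMPLE_RULES: RuleTable = {
    ("express", "Arg0"): "Arg1",
    ("express", "Arg1"): "Arg2",
    ("express", "ArgM-LOC"): "Arg3",
}


def convert(pas: PAS, rules: RuleTable) -> PAS:
    """Relabel each argument that has a matching rule; drop the rest (an assumption)."""
    converted = []
    for arg in pas.arguments:
        target = rules.get((pas.predicate, arg.role))
        if target is not None:
            converted.append(Argument(target, arg.text))
    return PAS(pas.predicate, tuple(converted))


# A made-up source structure in a PropBank-like labeling.
source = PAS(
    "express",
    (
        Argument("Arg0", "T cells"),
        Argument("Arg1", "IL-2"),
        Argument("ArgM-TMP", "after stimulation"),
    ),
)
result = convert(source, EXAMPLE_RULES)
for arg in result.arguments:
    print(arg.role, arg.text)
```

Keying the table on the (predicate, role) pair reflects the idea that a more restrictive, verb-specific format needs per-verb rules rather than one global role mapping.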

RESULTS

Our first experiment evaluated the rule-based converter's performance independently of BIOSMILE's performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs.
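
The abstract reports F-scores without stating the formula. Assuming the balanced F-score commonly used in SRL evaluation (the harmonic mean of precision and recall), the computation can be sketched as follows. The function name and the counts in the example are hypothetical and do not come from the paper.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-score (F1) from argument-level counts."""
    predicted = true_pos + false_pos  # arguments the system labeled
    gold = true_pos + false_neg       # arguments in the reference annotation
    if true_pos == 0 or predicted == 0 or gold == 0:
        return 0.0
    precision = true_pos / predicted
    recall = true_pos / gold
    return 2 * precision * recall / (precision + recall)


# Made-up counts for illustration only.
example = f_score(true_pos=8, false_pos=2, false_neg=2)
print(f"{example:.2%}")
```

Because the harmonic mean is pulled toward the lower of the two values, a system with high precision but low recall (or the reverse) still receives a low F-score.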

CONCLUSION

Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.
