Pathogens and gene product normalization in the biomedical literature.

Authors

Vishnyakova Dina, Pasche Emilie, Teodoro Douglas, Lovis Christian, Ruch Patrick

Affiliation

BiTeM Group.

Publication

Stud Health Technol Inform. 2012;174:89-93.

PMID: 22491118

Abstract

We present a new approach to normalizing pathogens and gene products in the biomedical literature. The approach is motivated by needs such as literature curation, particularly in the field of infectious diseases, where texts contain name variants of bacterial species (S. aureus, Staphylococcus aureus, ...) and of their gene products (protein ArsC, Arsenical pump modifier, Arsenate reductase, ...). It combines an Ontology Look-up Service (OLS), a Gene Ontology Categorizer (GOCat), and Gene Normalization (GN) methods. In the pathogen detection task, OLS is used to disambiguate the pathogen names found. GOCat results are incorporated into the overall scoring system to support and confirm decisions in the normalization of pathogens and their genomes. The evaluation was carried out on two test sets of the BioCreative III (BCIII) benchmark: a gold standard of 50 manually curated articles and a silver standard of 507 articles curated from the collective results of BCIII participants. For cross-species GN, we achieved a precision of 46% on the silver set and 27% on the gold set. Pathogen normalization reached 95% precision and 93% recall. GOCat improves both pathogen and gene normalization, mainly by confirming identified pathogens and moving correct gene identifiers toward the top of the confidence-ranked result list. Correct identification of the pathogen can significantly improve normalization effectiveness and resolve gene ambiguity.
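The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: map surface variants of a species name to one canonical name, then boost gene candidates that belong to a detected species. The lexicon, the gene identifiers (`GENE_A`, `GENE_B`), the base scores, and the `boost` value are all hypothetical; the authors' system relies on OLS, GOCat, and GN methods, which are not modeled here.

```python
import re

# Hypothetical toy lexicon: lowercase surface variant -> canonical species name.
PATHOGEN_LEXICON = {
    "staphylococcus aureus": "Staphylococcus aureus",
    "s. aureus": "Staphylococcus aureus",
    "escherichia coli": "Escherichia coli",
    "e. coli": "Escherichia coli",
}


def normalize_pathogens(text):
    """Return the canonical species names whose variants occur in the text."""
    lowered = text.lower()
    found = set()
    for variant, canonical in PATHOGEN_LEXICON.items():
        # Require non-word characters (or string edges) around the variant.
        pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(canonical)
    return found


def rerank(candidates, detected_species, boost=0.5):
    """Boost candidates whose species was detected, then sort by score.

    candidates: list of (gene_id, species, base_score) tuples (hypothetical).
    """
    scored = [
        (gene_id, score + (boost if species in detected_species else 0.0))
        for gene_id, species, score in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


text = "Arsenate reductase (ArsC) from S. aureus was characterized."
species = normalize_pathogens(text)
candidates = [
    ("GENE_A", "Escherichia coli", 0.6),
    ("GENE_B", "Staphylococcus aureus", 0.4),
]
ranked = rerank(candidates, species)
print(species)
print([gene_id for gene_id, _ in ranked])
```

In this toy run the candidate from the detected species ends up first even though its base score was lower, which mirrors the abstract's claim that identifying the pathogen helps resolve gene ambiguity.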
