Exploring automatic inconsistency detection for literature-based gene ontology annotation.

Affiliations

School of Computing and Information Systems, The University of Melbourne, Parkville, VIC 3010, Australia.

School of Computer Technologies, RMIT University, Melbourne, VIC 3000, Australia.

Publication information

Bioinformatics. 2022 Jun 24;38(Suppl 1):i273-i281. doi: 10.1093/bioinformatics/btac230.

DOI:10.1093/bioinformatics/btac230
PMID:35758780
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9235499/
Abstract

MOTIVATION

Literature-based gene ontology annotations (GOA) are biological database records that use controlled vocabulary to uniformly represent gene function information described in the primary literature. Assuring the quality of GOA is crucial for supporting biological research. However, a range of different kinds of inconsistencies between the literature cited as evidence and the annotated GO terms can be identified; these have not been systematically studied at the record level. The existing manual-curation approach to GOA consistency assurance is inefficient and cannot keep pace with the rate of updates to gene function knowledge. Automatic tools are therefore needed to assist with GOA consistency assurance. This article presents an exploration of different GOA inconsistencies and an early feasibility study of automatic inconsistency detection.

RESULTS

We have created a reliable synthetic dataset to simulate four realistic types of GOA inconsistency in biological databases. Three automatic approaches are proposed. They provide reasonable performance on the task of distinguishing the four types of inconsistency and are directly applicable to detecting inconsistencies in real-world GOA database records. Major challenges resulting from such inconsistencies in the context of several specific application settings are reported. This is the first study to introduce automatic approaches designed to address the challenges in current GOA quality assurance workflows. The data underlying this article are available on GitHub at https://github.com/jiyuc/AutoGOAConsistency.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6297/9235499/e580319649e4/btac230f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6297/9235499/4b3e6c16ebdf/btac230f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6297/9235499/43d96c1283a9/btac230f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6297/9235499/a7a6fb85e5ff/btac230f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6297/9235499/77cb91f11258/btac230f5.jpg
