SSIF: Subsumption-based Sub-term Inference Framework to audit Gene Ontology.

Affiliations

School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Department of Computer Science.

Publication information

Bioinformatics. 2020 May 1;36(10):3207-3214. doi: 10.1093/bioinformatics/btaa106.

DOI:10.1093/bioinformatics/btaa106
PMID:32065617
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7214018/
Abstract

MOTIVATION

The Gene Ontology (GO) is the unifying biological vocabulary for codifying, managing and sharing biological knowledge. Quality issues in GO, if not addressed, can cause misleading results or missed biological discoveries. Manual identification of potential quality issues in GO is a challenging and arduous task, given its growing size. We introduce an automated auditing approach for suggesting potentially missing is-a relations, which may further reveal erroneous is-a relations.

RESULTS

We developed a Subsumption-based Sub-term Inference Framework (SSIF) by leveraging a novel term-algebra on top of a sequence-based representation of GO concepts along with three conditional rules (monotonicity, intersection and sub-concept rules). Applying SSIF to the October 3, 2018 release of GO suggested 1938 unique potentially missing is-a relations. Domain experts evaluated a random sample of 210 potentially missing is-a relations. The results showed SSIF achieved a precision of 60.61, 60.49 and 46.03% for the monotonicity, intersection and sub-concept rules, respectively.
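The abstract does not spell out the term-algebra or the exact conditions of the three rules, so the following is only a loose sketch of the general idea of suggesting a missing is-a relation by sub-term substitution over word sequences. It is not the SSIF implementation (which is in Java) and not the paper's rule definitions. The function names, the whitespace tokenizer, and the four toy term names are all assumptions made for illustration and do not reflect actual GO content.

```python
def tokenize(name):
    """Assumed sequence representation: lowercase words split on whitespace."""
    return tuple(name.lower().split())


def ancestors(term, is_a):
    """All terms reachable from `term` by following is-a edges."""
    seen, stack = set(), [term]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def replace_subsequence(seq, old, new):
    """Yield each sequence made by replacing one occurrence of `old` with `new`."""
    n = len(old)
    for i in range(len(seq) - n + 1):
        if seq[i:i + n] == old:
            yield seq[:i] + new + seq[i + n:]


def suggest_missing_is_a(terms, is_a):
    """Hypothetical heuristic: if `sub` is-a `sup`, and swapping `sub` for `sup`
    inside `child` gives the name of an existing term that is not yet an
    ancestor of `child`, report (child, that term) as a candidate relation."""
    by_seq = {tokenize(t): t for t in terms}
    suggestions = set()
    for child in terms:
        child_seq = tokenize(child)
        known = ancestors(child, is_a)
        for sub in terms:
            if sub == child:
                continue
            for sup in ancestors(sub, is_a):
                for cand in replace_subsequence(child_seq, tokenize(sub), tokenize(sup)):
                    parent = by_seq.get(cand)
                    if parent and parent != child and parent not in known:
                        suggestions.add((child, parent))
    return sorted(suggestions)


# Toy vocabulary (illustrative only).
terms = ["growth", "cell growth", "regulation of growth", "regulation of cell growth"]
is_a = {"cell growth": ["growth"]}

print(suggest_missing_is_a(terms, is_a))
```

In this toy hierarchy the only candidate is ("regulation of cell growth", "regulation of growth"). Once that relation is added to `is_a`, the function returns an empty list. Any such candidate would still need expert review, as in the evaluation described above.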

AVAILABILITY AND IMPLEMENTATION

SSIF is implemented in Java. The source code is available at https://github.com/rashmie/SSIF.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7af5/7214018/70b2819928b1/btaa106f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7af5/7214018/a52f33e8e9d5/btaa106f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7af5/7214018/9718a0d690fe/btaa106f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7af5/7214018/5d33ea18e3f6/btaa106f3.jpg
