
Towards a reproducible interactome: semantic-based detection of redundancies to unify protein-protein interaction databases.

Authors

Melkonian Marc, Juigné Camille, Dameron Olivier, Rabut Gwenaël, Becker Emmanuelle

Affiliations

Univ Rennes, Inria, CNRS, IRISA - UMR 6074, F-35000 Rennes, France.

Univ Rennes, CNRS, IGDR - UMR 6290, F-35000 Rennes, France.

Publication information

Bioinformatics. 2022 Mar 4;38(6):1685-1691. doi: 10.1093/bioinformatics/btac013.

DOI: 10.1093/bioinformatics/btac013
PMID: 35015827

Abstract

MOTIVATION

Information on protein-protein interactions is collected in numerous primary databases with their own curation process. Several meta-databases aggregate primary databases to provide more exhaustive datasets. In addition to exhaustivity, aggregation contributes to reliability by providing an overview of the various studies and detection methods supporting an interaction. However, interactions listed in different primary databases are partly redundant because some publications reporting protein-protein interactions have been curated by multiple primary databases. Mere aggregation can thus introduce a bias if these redundancies are not identified and eliminated. To overcome this bias, meta-databases rely on the Molecular Interaction ontology that describes interaction detection methods, but they do not fully take advantage of the ontology's rich semantics, which leads to systematically overestimating interaction reproducibility.

RESULTS

We propose a precise definition of explicit and implicit redundancy and show that both can be easily detected using Semantic Web technologies. We apply this process to a dataset from the Agile Protein Interactomes DataServer (APID) meta-database and show that while explicit redundancies were detected by the APID aggregation process, about 15% of APID entries are implicitly redundant and should not be taken into account when presenting confidence-related metrics. More than 90% of implicit redundancies result from the aggregation of distinct primary databases, whereas the remainder occur between entries of a single database. Finally, we build a 'reproducible interactome' with interactions that have been reproduced by multiple methods or publications. The size of the reproducible interactome is drastically impacted by removing redundancies for both yeast (-59%) and human (-56%), and we show that this is largely due to implicit redundancies.
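The abstract does not spell out the redundancy definitions or the queries used, so the following is only a minimal sketch under stated assumptions. It assumes that two entries for the same protein pair and the same publication are explicitly redundant when they name the same detection method, and implicitly redundant when one method is an ancestor of the other in the detection-method hierarchy. It also assumes an interaction counts as reproduced when, after redundancy removal, it is supported by more than one publication or more than one method. The `PARENT` table, the term names, the protein and publication identifiers, and the function names are all hypothetical; plain Python dictionaries stand in here for the Semantic Web technologies the authors mention.

```python
from collections import defaultdict

# Hypothetical fragment of a detection-method hierarchy (child -> parent).
# Illustrative only: not loaded from the real Molecular Interaction ontology.
PARENT = {
    "two hybrid array": "two hybrid",
    "tandem affinity purification": "affinity chromatography",
}


def ancestors(term):
    """Return the set of strict ancestors of a method term."""
    found = set()
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found


def deduplicate(entries):
    """Collapse redundant entries.

    entries: iterable of (protein_a, protein_b, publication_id, method).
    Returns {(pair, publication_id): set of most specific methods}.
    """
    grouped = defaultdict(set)
    for a, b, pub, method in entries:
        pair = tuple(sorted((a, b)))      # interactions are undirected
        grouped[(pair, pub)].add(method)  # the set drops explicit duplicates
    kept = {}
    for key, methods in grouped.items():
        implied = set()
        for m in methods:
            implied |= ancestors(m)
        # A method that is an ancestor of another method reported for the
        # same pair and publication is treated as implicitly redundant.
        kept[key] = methods - implied
    return kept


def reproducible(kept):
    """Pairs supported by more than one publication or more than one method."""
    pubs = defaultdict(set)
    methods = defaultdict(set)
    for (pair, pub), ms in kept.items():
        pubs[pair].add(pub)
        methods[pair] |= ms
    return {p for p in pubs if len(pubs[p]) > 1 or len(methods[p]) > 1}


# Made-up entries, as if aggregated from several primary databases.
entries = [
    ("P1", "P2", "pub1", "two hybrid"),
    ("P2", "P1", "pub1", "two hybrid array"),  # implicit redundancy
    ("P1", "P2", "pub1", "two hybrid"),        # explicit redundancy
    ("P3", "P4", "pub2", "two hybrid"),
    ("P3", "P4", "pub3", "affinity chromatography"),
]
kept = deduplicate(entries)
print(kept)
print(reproducible(kept))
```

In this toy input, the P1/P2 pair is backed by a single experiment reported at two levels of detail. A count that ignores the hierarchy would see two methods and treat the interaction as reproduced, which is the kind of overestimation the abstract describes.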

AVAILABILITY AND IMPLEMENTATION

Software, data and results are available at https://gitlab.com/nnet56/reproducible-interactome, https://reproducible-interactome.genouest.org/, Zenodo (https://doi.org/10.5281/zenodo.5595037) and NDEx (https://doi.org/10.18119/N94302 and https://doi.org/10.18119/N97S4D).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Towards a reproducible interactome: semantic-based detection of redundancies to unify protein-protein interaction databases.
Bioinformatics. 2022 Mar 4;38(6):1685-1691. doi: 10.1093/bioinformatics/btac013.
2
APID interactomes: providing proteome-based interactomes with controlled quality for multiple species and derived networks.
Nucleic Acids Res. 2016 Jul 8;44(W1):W529-35. doi: 10.1093/nar/gkw363. Epub 2016 Apr 30.
3
APID database: redefining protein-protein interaction experimental evidences and binary interactomes.
Database (Oxford). 2019 Jan 1;2019:baz005. doi: 10.1093/database/baz005.
4
STRING-ing together protein complexes: corpus and methods for extracting physical protein interactions from the biomedical literature.
Bioinformatics. 2024 Sep 2;40(9). doi: 10.1093/bioinformatics/btae552.
5
Protein-protein interaction databases: keeping up with growing interactomes.
Hum Genomics. 2009 Apr;3(3):291-7. doi: 10.1186/1479-7364-3-3-291.
6
APID: Agile Protein Interaction DataAnalyzer.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W298-302. doi: 10.1093/nar/gkl128.
7
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
8
A rapid and accurate approach for prediction of interactomes from co-elution data (PrInCE).
BMC Bioinformatics. 2017 Oct 23;18(1):457. doi: 10.1186/s12859-017-1865-8.
9
New insights into protein-protein interaction data lead to increased estimates of the S. cerevisiae interactome size.
BMC Bioinformatics. 2010 Dec 21;11:605. doi: 10.1186/1471-2105-11-605.
10
Response to letter to the editor from Dr Rahman Shiri: The challenging topic of suicide across occupational groups.
Scand J Work Environ Health. 2018 Jan 1;44(1):108-110. doi: 10.5271/sjweh.3698. Epub 2017 Dec 8.

Cited by

1
State of the interactomes: an evaluation of molecular networks for generating biological insights.
Mol Syst Biol. 2025 Jan;21(1):1-29. doi: 10.1038/s44320-024-00077-y. Epub 2024 Dec 9.
2
Accurate and sensitive interactome profiling using a quantitative protein-fragment complementation assay.
Cell Rep Methods. 2024 Oct 21;4(10):100880. doi: 10.1016/j.crmeth.2024.100880.
3
PhyloString: A web server designed to identify, visualize, and evaluate functional relationships between orthologous protein groups across different phylogenetic lineages.
PLoS One. 2024 Jan 26;19(1):e0297010. doi: 10.1371/journal.pone.0297010. eCollection 2024.