Protein-Protein Interactions: Gene Acronym Redundancies and Current Limitations Precluding Automated Data Integration.

Authors

Casado-Vela Juan, Matthiesen Rune, Sellés Susana, Naranjo José Ramón

Affiliations

Spanish National Research Council (CSIC) - Spanish National Biotechnology Centre (CNB), Darwin 3, Cantoblanco, 28049 Madrid, Spain.

Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal.

Publication information

Proteomes. 2013 May 31;1(1):3-24. doi: 10.3390/proteomes1010003.

DOI: 10.3390/proteomes1010003
PMID: 28250396
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5314489/
Abstract

Understanding protein interaction networks and their dynamic changes is a major challenge in modern biology. Currently, several experimental and in silico approaches allow large-scale screening of protein interactors. As a result, the volume of protein interaction data deposited in databases and published in the peer-reviewed literature is constantly growing. Multiple databases, interfaced through user-friendly web tools, have recently emerged to facilitate the retrieval and integration of protein interaction data. Nevertheless, as we show in this report, despite current efforts towards data integration, the information on protein interactions retrieved by in silico approaches is frequently incomplete and may even list false interactions. Here we point to some obstacles that preclude confident data integration, with special emphasis on protein interactions; these include gene acronym redundancies and protein synonyms. Three human proteins (choline kinase, PPIase and uromodulin) and three web-based search engines focused on protein interaction data retrieval (PSICQUIC, DASMI and BIPS) were used to illustrate the errors that can arise and that researchers in the field should take into account. We demonstrate that, despite recent initiatives towards data standardization, manual curation of protein interaction networks based on literature searches is still required to remove potential false positives. A three-step workflow consisting of (i) data retrieval from multiple databases, (ii) peer-reviewed literature searches, and (iii) data curation and integration is proposed as the best strategy for gathering up-to-date information on protein interactions. Finally, this strategy was applied to compile information on the human DREAM protein interactome, which constitutes a reliable training dataset that can be used to improve computational predictions.
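The acronym-redundancy problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or the behavior of PSICQUIC, DASMI or BIPS. It assumes a hypothetical alias table (`ALIASES`) and hypothetical interaction records, with placeholder gene symbols and source names throughout. Records from several sources are merged only when both names resolve to exactly one canonical gene; ambiguous or unknown names are set aside for manual, literature-based curation.

```python
from collections import defaultdict

# Hypothetical alias table: canonical gene symbol -> acronyms/synonyms.
# All symbols are placeholders, not real database content.
ALIASES = {
    "GENE1": {"GENE1", "ABC", "XK"},
    "GENE2": {"GENE2", "ABC"},   # shares the acronym "ABC" with GENE1
    "GENE3": {"GENE3", "UMD"},
}


def build_alias_index(aliases):
    """Invert the table: acronym -> set of canonical symbols that use it."""
    index = defaultdict(set)
    for canonical, names in aliases.items():
        for name in names:
            index[name.upper()].add(canonical)
    return index


def integrate(records, index):
    """Merge (source, name_a, name_b) records from several sources.

    A pair is accepted only if each name maps to exactly one canonical
    symbol; every other record is queued for manual curation.
    """
    accepted = defaultdict(set)   # frozenset of canonical symbols -> sources
    needs_review = []
    for source, name_a, name_b in records:
        hits_a = index.get(name_a.upper(), set())
        hits_b = index.get(name_b.upper(), set())
        if len(hits_a) == 1 and len(hits_b) == 1:
            pair = frozenset((next(iter(hits_a)), next(iter(hits_b))))
            accepted[pair].add(source)
        else:
            needs_review.append((source, name_a, name_b))
    return dict(accepted), needs_review


# Hypothetical records as they might come back from two sources.
records = [
    ("db1", "XK", "UMD"),       # unambiguous synonyms
    ("db2", "GENE1", "GENE3"),  # same pair under canonical symbols
    ("db2", "ABC", "UMD"),      # "ABC" is shared by two genes
]
index = build_alias_index(ALIASES)
accepted, needs_review = integrate(records, index)
print(accepted)
print(needs_review)
```

In this toy run the first two records collapse into one interaction supported by both sources, while the record using the shared acronym is held back instead of being assigned to either gene. That held-back queue corresponds to the step where the abstract argues manual curation is still needed.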

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8087/5314489/0bf70b18eb9c/proteomes-01-00003-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8087/5314489/c809bf4b8a75/proteomes-01-00003-g002.jpg
