
Scalable Text Mining Assisted Curation of Post-Translationally Modified Proteoforms in the Protein Ontology.

Authors

Ross Karen E, Natale Darren A, Arighi Cecilia, Chen Sheng-Chih, Huang Hongzhan, Li Gang, Ren Jia, Wang Michael, Vijay-Shanker K, Wu Cathy H

Affiliations

Protein Information Resource, Georgetown University Medical Center, Washington, DC, USA.

Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA.

Publication information

CEUR Workshop Proc. 2016 Aug;1747. Epub 2016 Nov 29.

PMID: 28706471

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5504912/

Abstract

The Protein Ontology (PRO) defines protein classes and their interrelationships from the family to the protein form (proteoform) level within and across species. One of the unique contributions of PRO is its representation of post-translationally modified (PTM) proteoforms. However, progress in adding PTM proteoform classes to PRO has been relatively slow due to the extensive manual curation effort required. Here we report an automated pipeline for creation of PTM proteoform classes that leverages two phosphorylation-focused text mining tools (RLIMS-P, which detects mentions of kinases, substrates, and phosphorylation sites, and eFIP, which detects phosphorylation-dependent protein-protein interactions (PPIs)) and our integrated PTM database, iPTMnet. By applying this pipeline, we obtained a set of ~820 substrate-site pairs that are suitable for automated PRO term generation with literature-based evidence attribution. Inclusion of these terms in PRO will increase PRO coverage of species-specific PTM proteoforms by 50%. Many of these new proteoforms also have associated kinase and/or PPI information. Finally, we show a phosphorylation network for the human and mouse peptidyl-prolyl cis-trans isomerase (PIN1/Pin1) derived from our dataset that demonstrates the biological complexity of the information we have extracted. Our approach addresses scalability in PRO curation and will be further expanded to advance PRO representation of phosphorylated proteoforms.
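
The abstract does not spell out how substrate-site pairs are selected or how terms are built, so the following is only an illustrative sketch of the general idea: group text-mined phosphorylation mentions by substrate and site, keep the pairs that also appear in a database, and attach the supporting PMIDs as evidence. The record layout (`Mention`, `CandidateTerm`), the function `build_candidates`, the database-confirmation filter, and the label format are all assumptions made for illustration, not the authors' pipeline or the actual RLIMS-P, eFIP, or iPTMnet interfaces. The identifiers in the usage example are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Mention:
    """One text-mined phosphorylation mention (hypothetical layout)."""
    substrate: str                 # substrate identifier
    site: str                      # modified residue, e.g. "Ser-16"
    pmid: str                      # article supporting the mention
    kinase: Optional[str] = None   # kinase, if one was detected


@dataclass
class CandidateTerm:
    """A candidate proteoform term with literature evidence (hypothetical)."""
    substrate: str
    site: str
    label: str
    pmids: List[str]
    kinases: List[str]


def build_candidates(
    mentions: List[Mention],
    database_sites: Set[Tuple[str, str]],
) -> List[CandidateTerm]:
    """Group mentions by (substrate, site) and keep database-confirmed pairs.

    Requiring database confirmation is an assumed filter; the source only
    says the pipeline combines text mining with a PTM database.
    """
    grouped = defaultdict(list)
    for mention in mentions:
        grouped[(mention.substrate, mention.site)].append(mention)

    terms = []
    for (substrate, site), group in sorted(grouped.items()):
        if (substrate, site) not in database_sites:
            continue  # no database support: leave for manual review
        terms.append(
            CandidateTerm(
                substrate=substrate,
                site=site,
                label=f"{substrate} phosphorylated at {site}",
                pmids=sorted({m.pmid for m in group}),
                kinases=sorted({m.kinase for m in group if m.kinase}),
            )
        )
    return terms


if __name__ == "__main__":
    # Placeholder identifiers, not real proteins, kinases, or PMIDs.
    demo = [
        Mention("SUB1", "Ser-16", "1001", "KIN_A"),
        Mention("SUB1", "Ser-16", "1002"),
        Mention("SUB2", "Thr-5", "1003", "KIN_B"),
    ]
    for term in build_candidates(demo, {("SUB1", "Ser-16")}):
        print(term.label, term.pmids, term.kinases)
```

Keeping the PMIDs on each candidate mirrors the literature-based evidence attribution the abstract mentions; pairs without database support are skipped here rather than discarded, on the assumption that they would go to a curator.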
