PhosBERT: A self-supervised learning model for identifying phosphorylation sites in SARS-CoV-2-infected human cells.

Affiliations

Sichuan Vocational College of Health and Rehabilitation, Zigong 643000, Sichuan, China.

The People's Hospital of Ya'an, Ya'an 625000, Sichuan, China; The People's Hospital of Wenjiang Chengdu, Chengdu 611130, Sichuan, China.

Publication information

Methods. 2024 Oct;230:140-146. doi: 10.1016/j.ymeth.2024.08.004. Epub 2024 Aug 22.

DOI:10.1016/j.ymeth.2024.08.004
PMID:39179191
Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a single-stranded RNA virus that mainly causes respiratory and enteric diseases and is responsible for the outbreak of coronavirus disease 2019 (COVID-19). Numerous studies have demonstrated that SARS-CoV-2 infection leads to significant dysregulation of the protein post-translational modification profile in human cells. Accurate recognition of phosphorylation sites in host cells contributes to a deeper understanding of the pathogenic mechanisms of SARS-CoV-2 and helps to screen drugs and compounds with antiviral potential. Therefore, cost-effective and high-precision computational strategies are needed for specifically identifying phosphorylation sites in SARS-CoV-2-infected cells. In this work, we first implemented a custom neural network model (named PhosBERT) on top of the pre-trained protein language model ProtBert, a self-supervised learning approach built on the Bidirectional Encoder Representations from Transformers (BERT) architecture. PhosBERT was then trained and validated with 5-fold cross-validation on a serine (S) and threonine (T) phosphorylation dataset and on a tyrosine (Y) phosphorylation dataset, separately. Independent validation showed that PhosBERT identified S/T phosphorylation sites with an accuracy of 81.9% and an AUC (area under the receiver operating characteristic curve) of 0.896. For Y phosphorylation sites, the prediction accuracy and AUC reached 87.1% and 0.902, respectively. These results indicate that the proposed model has good predictive ability and stability and provides a new approach for studying SARS-CoV-2 phosphorylation sites.
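
The evaluation protocol named in the abstract (5-fold cross-validation, scored by accuracy and AUC) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the stratified splitting, and the fixed random seed are all assumptions made for illustration, and the model itself (ProtBert plus a custom network) is left out entirely.

```python
import random


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k folds.

    Stratification (keeping the class ratio similar in every fold) is an
    assumption here; the abstract only says "5-fold cross-validation".
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        val = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, val
```

In use, each `(train_idx, val_idx)` pair would select the site-centred sequences used to fit and to validate one copy of the classifier, and `accuracy` and `roc_auc` would then be computed on the held-out predictions, first per fold and finally on the independent test set.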


Similar articles

1
PhosBERT: A self-supervised learning model for identifying phosphorylation sites in SARS-CoV-2-infected human cells.
Methods. 2024 Oct;230:140-146. doi: 10.1016/j.ymeth.2024.08.004. Epub 2024 Aug 22.
2
Advancing the accuracy of SARS-CoV-2 phosphorylation site detection via meta-learning approach.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad433.
3
Adapt-Kcr: a novel deep learning framework for accurate prediction of lysine crotonylation sites based on learning embedding features and attention architecture.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac037.
4
Computational prediction of phosphorylation sites of SARS-CoV-2 infection using feature fusion and optimization strategies.
Methods. 2024 Sep;229:1-8. doi: 10.1016/j.ymeth.2024.04.021. Epub 2024 May 18.
5
DeepIPs: comprehensive assessment and computational identification of phosphorylation sites of SARS-CoV-2 infection using a deep learning-based approach.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab244.
6
Adaptive learning embedding features to improve the predictive performance of SARS-CoV-2 phosphorylation sites.
Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad627.
7
Analysis of Glycosylation and Disulfide Bonding of Wild-Type SARS-CoV-2 Spike Glycoprotein.
J Virol. 2022 Feb 9;96(3):e0162621. doi: 10.1128/JVI.01626-21. Epub 2021 Nov 24.
8
SARS-CoV-2 spike protein-mediated cell signaling in lung vascular cells.
Vascul Pharmacol. 2021 Apr;137:106823. doi: 10.1016/j.vph.2020.106823. Epub 2020 Nov 21.
9
Genome-wide bioinformatics analysis of human protease capacity for proteolytic cleavage of the SARS-CoV-2 spike glycoprotein.
Microbiol Spectr. 2024 Feb 6;12(2):e0353023. doi: 10.1128/spectrum.03530-23. Epub 2024 Jan 8.
10
, and Models for Monitoring SARS-CoV-2 Spike/Human ACE2 Complex, Viral Entry and Cell Fusion.
Viruses. 2021 Feb 25;13(3):365. doi: 10.3390/v13030365.

Cited by

1
Empirical Comparison and Analysis of Artificial Intelligence-Based Methods for Identifying Phosphorylation Sites of SARS-CoV-2 Infection.
Int J Mol Sci. 2024 Dec 21;25(24):13674. doi: 10.3390/ijms252413674.
2
GPS-pPLM: A Language Model for Prediction of Prokaryotic Phosphorylation Sites.
Cells. 2024 Nov 8;13(22):1854. doi: 10.3390/cells13221854.