Optimization of serine phosphorylation prediction in proteins by comparing human engineered features and deep representations.

Affiliations

Department of Computer Science, University of Management and Technology, Lahore, Pakistan.

National Center of Artificial Intelligence, Punjab University College of Information Technology, University of the Punjab, Lahore, Pakistan; Center for Professional & Applied Studies, Lahore, Pakistan.

Publication information

Anal Biochem. 2021 Feb 15;615:114069. doi: 10.1016/j.ab.2020.114069. Epub 2020 Dec 16.

DOI: 10.1016/j.ab.2020.114069
PMID: 33340540

Abstract

Deep representations can replace human-engineered representations, which are constrained by certain limitations. To predict protein post-translational modification (PTM) sites, the research community applies various feature extraction techniques to pseudo amino acid compositions (PseAAC). Serine phosphorylation is one of the most important PTMs: it is the most frequently occurring and is important for various biological functions. Creating efficient representations from large protein sequences to predict PTM sites is a time- and resource-intensive task. In this study, we propose, implement, and evaluate the use of deep learning to learn effective protein data representations from PseAAC in order to develop data-driven PTM detection systems, and we compare these with two human-engineered representations. The comparisons are performed by training an XGBoost-based classifier on each representation. The best scores were achieved by the RNN-LSTM-based deep representation and the CNN-based representation, with accuracies of 81.1% and 78.3%, respectively. The human-engineered representations scored 77.3% and 74.9%, respectively. Based on these results, we conclude that deep features are a promising replacement for feature engineering for identifying phosphoserine (PhosS) sites efficiently and accurately, which can help scientists understand the mechanism of this modification in proteins.
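
The comparison protocol in the abstract (hold the classifier fixed, vary only the representation, compare accuracy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the data are synthetic serine-centred windows with an arbitrary labelling rule, the two encoders (`composition`, `positional`) are toy stand-ins for the paper's representations, and a simple nearest-centroid classifier stands in for XGBoost so that the example runs on the Python standard library alone.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(window):
    # Toy hand-engineered representation: frequency of each amino acid.
    counts = Counter(window)
    return [counts[a] / len(window) for a in AMINO_ACIDS]

def positional(window):
    # Second toy representation: scaled residue index at each position.
    return [AMINO_ACIDS.index(a) / (len(AMINO_ACIDS) - 1) for a in window]

class NearestCentroid:
    # Stand-in classifier (NOT XGBoost): predicts the class whose mean
    # feature vector is closest to the sample.
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c]))
                for x in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def make_toy_windows(n, rng, length=11):
    # Synthetic data only: random windows with "S" in the centre. Positive
    # windows usually get "P" right after the serine (an arbitrary toy rule).
    windows, labels = [], []
    mid = length // 2
    for i in range(n):
        label = i % 2
        residues = [rng.choice(AMINO_ACIDS) for _ in range(length)]
        residues[mid] = "S"
        if label == 1 and rng.random() < 0.8:
            residues[mid + 1] = "P"
        windows.append("".join(residues))
        labels.append(label)
    return windows, labels

def compare_representations(representations, n=400, seed=0):
    # Same data, same split, same classifier; only the encoder changes.
    rng = random.Random(seed)
    windows, labels = make_toy_windows(n, rng)
    split = int(0.75 * n)
    scores = {}
    for name, encode in representations.items():
        X = [encode(w) for w in windows]
        model = NearestCentroid().fit(X[:split], labels[:split])
        scores[name] = accuracy(labels[split:], model.predict(X[split:]))
    return scores

scores = compare_representations(
    {"composition": composition, "positional": positional}
)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Because the split and the classifier are shared, any difference in the printed accuracies is attributable to the representation alone, which is the point of the comparison. The numbers produced here describe the toy data only and say nothing about the results reported in the abstract.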

Similar articles

1. Optimization of serine phosphorylation prediction in proteins by comparing human engineered features and deep representations.
   Anal Biochem. 2021 Feb 15;615:114069. doi: 10.1016/j.ab.2020.114069. Epub 2020 Dec 16.
2. iPhosS(Deep)-PseAAC: Identification of Phosphoserine Sites in Proteins Using Deep Learning on General Pseudo Amino Acid Compositions.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 May-Jun;19(3):1703-1714. doi: 10.1109/TCBB.2020.3040747. Epub 2022 Jun 3.
3. DeepPPSite: A deep learning-based model for analysis and prediction of phosphorylation sites using efficient sequence information.
   Anal Biochem. 2021 Jan 1;612:113955. doi: 10.1016/j.ab.2020.113955. Epub 2020 Sep 16.
4. Predicting protein phosphorylation sites in soybean using interpretable deep tabular learning network.
   Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac015.
5. A Hybrid Deep Learning Model for Predicting Protein Hydroxylation Sites.
   Int J Mol Sci. 2018 Sep 18;19(9):2817. doi: 10.3390/ijms19092817.
6. Boosting phosphorylation site prediction with sequence feature-based machine learning.
   Proteins. 2020 Feb;88(2):284-291. doi: 10.1002/prot.25801. Epub 2019 Aug 22.
7. Mini-review: Recent advances in post-translational modification site prediction based on deep learning.
   Comput Struct Biotechnol J. 2022 Jun 30;20:3522-3532. doi: 10.1016/j.csbj.2022.06.045. eCollection 2022.
8. iGluK-Deep: computational identification of lysine glutarylation sites using deep neural networks with general pseudo amino acid compositions.
   J Biomol Struct Dyn. 2022;40(22):11691-11704. doi: 10.1080/07391102.2021.1962738. Epub 2021 Aug 16.
9. Large-scale comparative assessment of computational predictors for lysine post-translational modification sites.
   Brief Bioinform. 2019 Nov 27;20(6):2267-2290. doi: 10.1093/bib/bby089.
10. iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC.
    Mol Biol Rep. 2018 Dec;45(6):2501-2509. doi: 10.1007/s11033-018-4417-z. Epub 2018 Oct 11.

Cited by

1. DeepO-GlcNAc: a web server for prediction of protein O-GlcNAcylation sites using deep learning combined with attention mechanism.
   Front Cell Dev Biol. 2024 Oct 10;12:1456728. doi: 10.3389/fcell.2024.1456728. eCollection 2024.
2. GAPS: a geometric attention-based network for peptide binding site identification by the transfer learning approach.
   Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae297.
3. m1A-Ensem: accurate identification of 1-methyladenosine sites through ensemble models.
   BioData Min. 2024 Feb 15;17(1):4. doi: 10.1186/s13040-023-00353-x.
4. BBB-PEP-prediction: improved computational model for identification of blood-brain barrier peptides using blending position relative composition specific features and ensemble modeling.
   J Cheminform. 2023 Nov 18;15(1):110. doi: 10.1186/s13321-023-00773-1.
5. A Review of Machine Learning and Algorithmic Methods for Protein Phosphorylation Site Prediction.
   Genomics Proteomics Bioinformatics. 2023 Dec;21(6):1266-1285. doi: 10.1016/j.gpb.2023.03.007. Epub 2023 Oct 19.
6. A Framework for Prediction of Oncogenomic Progression Aiding Personalized Treatment of Gastric Cancer.
   Diagnostics (Basel). 2023 Jul 6;13(13):2291. doi: 10.3390/diagnostics13132291.
7. Hemolytic-Pred: A machine learning-based predictor for hemolytic proteins using position and composition-based features.
   Digit Health. 2023 Jul 5;9:20552076231180739. doi: 10.1177/20552076231180739. eCollection 2023 Jan-Dec.
8. Ensemble Learning for Hormone Binding Protein Prediction: A Promising Approach for Early Diagnosis of Thyroid Hormone Disorders in Serum.
   Diagnostics (Basel). 2023 Jun 1;13(11):1940. doi: 10.3390/diagnostics13111940.
9. EDLM: Ensemble Deep Learning Model to Detect Mutation for the Early Detection of Cholangiocarcinoma.
   Genes (Basel). 2023 May 18;14(5):1104. doi: 10.3390/genes14051104.
10. MIND-S is a deep-learning prediction model for elucidating protein post-translational modifications in human diseases.
    Cell Rep Methods. 2023 Mar 27;3(3):100430. doi: 10.1016/j.crmeth.2023.100430.