Prediction of lysine formylation sites using support vector machine based on the sample selection from majority classes and synthetic minority over-sampling techniques.

Author affiliations

Dept. of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh; Dept. of Computer Science and Engineering, Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh.

Dept. of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh.

Publication information

Biochimie. 2022 Jan;192:125-135. doi: 10.1016/j.biochi.2021.10.001. Epub 2021 Oct 7.

DOI:10.1016/j.biochi.2021.10.001
PMID:34627982
Abstract

Lysine formylation is a recently discovered post-translational modification (PTM) that has attracted considerable interest. It is generally found on core and linker histone proteins of prokaryotes and eukaryotes and plays important roles in regulating various cellular mechanisms. Accurately identifying formylation sites in proteins is therefore important for understanding the molecular mechanism of formylation and for developing drugs for related diseases. Because experimental identification of formylation sites with traditional methods is expensive and time-consuming, a simple, fast computational model that accurately predicts lysine formylation sites is highly desirable. This study proposes a computational model named PLF_SVM that predicts formylated and non-formylated lysine sites using five feature generation methods: binary encoding (BE), amino acid composition (AAC), reverse position relative incidence matrix (RPRIM), position relative incidence matrix (PRIM), and position specific amino acid propensity (PSAAP). The Synthetic Minority Oversampling Technique (SMOTE) and a proposed sample selection strategy named EnSVM are applied to handle the imbalanced training dataset. The optimal number of features is then selected with the F-score method to train the model. PLF_SVM outperforms state-of-the-art approaches in validation and independent testing, with accuracies of 98.61% and 98.77%, respectively. A user-friendly web tool for identifying formylation sites is also available at https://plf-svm.herokuapp.com/. The proposed method may therefore serve as a useful guide for analyzing and predicting formylated lysine and for understanding cellular regulation.
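
The abstract names several building blocks without giving details, so the following is only a minimal sketch of three of them under stated assumptions: binary encoding and amino acid composition of a peptide window, SMOTE-style interpolation of minority samples, and a per-feature F-score ranking. The peptide windows are made-up toy data, the function names are hypothetical, and the F-score uses the commonly cited between-class over within-class variance definition, which may differ from what the authors implemented. PRIM, RPRIM, PSAAP, the EnSVM sample selection, and the SVM training itself are not shown.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def binary_encode(window):
    """One-hot encode each residue; unknown characters become all zeros."""
    vec = []
    for res in window:
        vec.extend(1.0 if res == aa else 0.0 for aa in AMINO_ACIDS)
    return vec

def aac(window):
    """Fraction of each standard amino acid in the window."""
    counted = [r for r in window if r in AMINO_ACIDS]
    n = len(counted) or 1
    return [counted.count(aa) / n for aa in AMINO_ACIDS]

def smote(minority, n_new, k=2, seed=0):
    """Interpolate between a minority sample and one of its k nearest
    minority neighbours (assumes at least two minority samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [m for m in minority if m is not x]
        neighbours = sorted(others, key=lambda m: math.dist(x, m))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

def f_score(pos, neg):
    """Per-feature F-score: between-class scatter over within-class variance
    (assumes at least two samples per class)."""
    scores = []
    for j in range(len(pos[0])):
        p = [x[j] for x in pos]
        n = [x[j] for x in neg]
        mean_p, mean_n = sum(p) / len(p), sum(n) / len(n)
        mean_all = (sum(p) + sum(n)) / (len(p) + len(n))
        num = (mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2
        den = (sum((v - mean_p) ** 2 for v in p) / (len(p) - 1)
               + sum((v - mean_n) ** 2 for v in n) / (len(n) - 1))
        scores.append(num / den if den > 0 else 0.0)
    return scores

# Toy, hypothetical 5-residue windows centred on a lysine (K).
pos_windows = ["AKKGR", "GKKAR", "AKKAR"]
neg_windows = ["LLKDE", "VLKDD", "LIKEE", "LLKED"]

def features(window):
    return binary_encode(window) + aac(window)

pos = [features(w) for w in pos_windows]
neg = [features(w) for w in neg_windows]

# Oversample the minority class until both classes have the same size.
pos_balanced = pos + smote(pos, n_new=len(neg) - len(pos))

# Rank features by F-score and keep the ten highest-scoring indices.
scores = f_score(pos_balanced, neg)
top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:10]
```

In a fuller pipeline the columns listed in `top` would be passed to an SVM classifier, and the number of retained features would be tuned by cross-validation rather than fixed at ten.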


Similar articles

1
Prediction of lysine formylation sites using support vector machine based on the sample selection from majority classes and synthetic minority over-sampling techniques.
Biochimie. 2022 Jan;192:125-135. doi: 10.1016/j.biochi.2021.10.001. Epub 2021 Oct 7.
2
Formator: Predicting Lysine Formylation Sites Based on the Most Distant Undersampling and Safe-Level Synthetic Minority Oversampling.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Sep-Oct;18(5):1937-1945. doi: 10.1109/TCBB.2019.2957758. Epub 2021 Oct 7.
3
Prediction of lysine formylation sites using the composition of k-spaced amino acid pairs via Chou's 5-steps rule and general pseudo components.
Genomics. 2020 Jan;112(1):859-866. doi: 10.1016/j.ygeno.2019.05.027. Epub 2019 Jun 6.
4
Nε-formylation of lysine is a widespread post-translational modification of nuclear proteins occurring at residues involved in regulation of chromatin function.
Nucleic Acids Res. 2008 Feb;36(2):570-7. doi: 10.1093/nar/gkm1057. Epub 2007 Dec 1.
5
Prediction of lysine propionylation sites using biased SVM and incorporating four different sequence features into Chou's PseAAC.
J Mol Graph Model. 2017 Sep;76:356-363. doi: 10.1016/j.jmgm.2017.07.022. Epub 2017 Jul 25.
6
dForml(KNN)-PseAAC: Detecting formylation sites from protein sequences using K-nearest neighbor algorithm via Chou's 5-step rule and pseudo components.
J Theor Biol. 2019 Jun 7;470:43-49. doi: 10.1016/j.jtbi.2019.03.011. Epub 2019 Mar 14.
7
iDPGK: characterization and identification of lysine phosphoglycerylation sites based on sequence-based features.
BMC Bioinformatics. 2020 Dec 9;21(1):568. doi: 10.1186/s12859-020-03916-5.
8
Identify and analysis crotonylation sites in histone by using support vector machines.
Artif Intell Med. 2017 Nov;83:75-81. doi: 10.1016/j.artmed.2017.02.007. Epub 2017 Mar 7.
9
Prediction of protein N-formylation using the composition of k-spaced amino acid pairs.
Anal Biochem. 2017 Oct 1;534:40-45. doi: 10.1016/j.ab.2017.07.011. Epub 2017 Jul 11.
10
PLP_FS: prediction of lysine phosphoglycerylation sites in protein using support vector machine and fusion of multiple F_Score feature selection.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac306.

Cited by

1
DLBWE-Cys: a deep-learning-based tool for identifying cysteine S-carboxyethylation sites using binary-weight encoding.
Front Genet. 2025 Jan 8;15:1464976. doi: 10.3389/fgene.2024.1464976. eCollection 2024.
2
Predictive modeling for postoperative delirium in elderly patients with abdominal malignancies using synthetic minority oversampling technique.
World J Gastrointest Oncol. 2024 Apr 15;16(4):1227-1235. doi: 10.4251/wjgo.v16.i4.1227.