iSuc-ChiDT: a computational method for identifying succinylation sites using statistical difference table encoding and the chi-square decision table classifier.

Authors

Zeng Ying, Chen Yuan, Yuan Zheming

Affiliations

School of Computer and Communication, Hunan Institute of Engineering, Xiangtan, 411104, Hunan, China.

Hunan Engineering & Technology Research Center for Agricultural Big Data Analysis & Decision-making, Hunan Agricultural University, Changsha, 410128, Hunan, China.

Publication information

BioData Min. 2022 Feb 10;15(1):3. doi: 10.1186/s13040-022-00290-1.

DOI:10.1186/s13040-022-00290-1
PMID:35144656
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8832670/
Abstract

BACKGROUND

Lysine succinylation is a type of protein post-translational modification which is widely involved in cell differentiation, cell metabolism and other important physiological activities. To study the molecular mechanism of succinylation in depth, succinylation sites need to be accurately identified, and because experimental approaches are costly and time-consuming, there is a great demand for reliable computational methods. Feature extraction is a key step in building succinylation site prediction models, and the development of effective new features improves predictive accuracy. Because the number of false succinylation sites far exceeds that of true sites, traditional classifiers perform poorly, and designing a classifier to effectively handle highly imbalanced datasets has always been a challenge.
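The imbalance problem can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it uses the training-set class sizes reported in the Results (4748 true sites, 50,551 false sites) and a hypothetical classifier that labels every site as false. The helper name `confusion_metrics` is an assumption introduced for this example.

```python
import math

def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity, and MCC from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC is conventionally reported as 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Class sizes of the training set reported in the abstract.
N_TRUE, N_FALSE = 4748, 50551

# Hypothetical trivial classifier: every candidate site is predicted "false".
always_negative = confusion_metrics(tp=0, fn=N_TRUE, tn=N_FALSE, fp=0)

for name, value in always_negative.items():
    print(f"{name}: {value:.4f}")
```

Such a classifier is right on more than 91% of sites yet finds no true site at all, which is why sensitivity, specificity, and the Matthews correlation coefficient are more informative than plain accuracy here.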

RESULTS

A new computational method, iSuc-ChiDT, is proposed to identify succinylation sites in proteins. In iSuc-ChiDT, chi-square statistical difference table encoding is developed to extract positional features; it achieves higher predictive accuracy with fewer features than common position-based encoding schemes such as binary encoding and physicochemical property encoding. Single amino acid and undirected pair-coupled amino acid composition features are added to improve the fault tolerance for residue insertions and deletions. After feature selection by the Chi-MIC-share algorithm, the chi-square decision table (ChiDT) classifier is constructed for imbalanced classification. With a training set of 4,748:50,551 (true:false sites), ChiDT clearly outperforms traditional classifiers in predictive accuracy, and runs fast. Using an independent testing set of experimentally identified succinylation sites, iSuc-ChiDT achieves a sensitivity of 70.47%, a specificity of 66.27%, a Matthews correlation coefficient of 0.205, and a global accuracy index Q of 0.683, showing a significant improvement in sensitivity and overall accuracy compared to PSuccE, Success, SuccinSite, and other existing succinylation site predictors.
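The abstract does not spell out how the statistical difference table is built, so the following is only a rough sketch of one way a chi-square-based positional encoding could work, not the authors' implementation. It assumes that, for each window position and residue, a 2x2 contingency table (residue present or absent, true or false site) yields a Pearson chi-square score, signed by whether the residue is enriched in true sites. The function names, the sign convention, and the toy peptides are all hypothetical.

```python
from collections import Counter

def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def build_difference_table(positives, negatives):
    """Map (position, residue) to a signed chi-square score.

    Assumes equal-length peptide windows and non-empty classes.
    """
    length = len(positives[0])
    n_pos, n_neg = len(positives), len(negatives)
    table = {}
    for j in range(length):
        pos_counts = Counter(p[j] for p in positives)
        neg_counts = Counter(p[j] for p in negatives)
        for residue in set(pos_counts) | set(neg_counts):
            a = pos_counts[residue]  # true sites with this residue at position j
            c = neg_counts[residue]  # false sites with this residue at position j
            score = chi2_2x2(a, n_pos - a, c, n_neg - c)
            # Positive sign: residue is relatively more frequent in true sites.
            sign = 1.0 if a / n_pos >= c / n_neg else -1.0
            table[(j, residue)] = sign * score
    return table

def encode(peptide, table):
    """Encode a peptide as one score per position; unseen residues map to 0."""
    return [table.get((j, r), 0.0) for j, r in enumerate(peptide)]

# Toy windows centred on lysine (K); real windows would be much longer.
toy_positives = ["AKE", "AKD", "GKE"]
toy_negatives = ["LKR", "GKR", "LKD", "LKR"]
toy_table = build_difference_table(toy_positives, toy_negatives)

print(encode("AKE", toy_table))
print(encode("LKR", toy_table))
```

Under these assumptions each position contributes a single numeric feature, whereas binary (one-hot) encoding needs about 20 features per position, which is in line with the abstract's claim of fewer features. The central lysine scores 0 because it is identical in both classes.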

CONCLUSIONS

iSuc-ChiDT shows great promise in predicting succinylation sites and is expected to facilitate further experimental investigation of protein succinylation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b65/8832670/7198091baf91/13040_2022_290_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b65/8832670/759766330e98/13040_2022_290_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b65/8832670/a0e423312436/13040_2022_290_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b65/8832670/d7fd34cd7e24/13040_2022_290_Fig4_HTML.jpg

Similar articles

1. iSuc-ChiDT: a computational method for identifying succinylation sites using statistical difference table encoding and the chi-square decision table classifier.
   BioData Min. 2022 Feb 10;15(1):3. doi: 10.1186/s13040-022-00290-1.
2. CBDT-Oglyc: Prediction of O-glycosylation sites using ChiMIC-based balanced decision table and feature selection.
   J Bioinform Comput Biol. 2023 Oct;21(5):2350024. doi: 10.1142/S0219720023500245. Epub 2023 Oct 28.
3. A high-performance approach for predicting donor splice sites based on short window size and imbalanced large samples.
   Biol Direct. 2019 Apr 11;14(1):6. doi: 10.1186/s13062-019-0236-y.
4. SuccinSite: a computational tool for the prediction of protein succinylation sites by exploiting the amino acid patterns and properties.
   Mol Biosyst. 2016 Mar;12(3):786-95. doi: 10.1039/c5mb00853k. Epub 2016 Jan 7.
5. Detecting Succinylation sites from protein sequences using ensemble support vector machine.
   BMC Bioinformatics. 2018 Jun 25;19(1):237. doi: 10.1186/s12859-018-2249-4.
6. An Improved Computational Prediction Model for Lysine Succinylation Sites Mapping on Fusing Three Sequence Encoding Schemes with the Random Forest Classifier.
   Curr Genomics. 2021 Feb;22(2):122-136. doi: 10.2174/1389202922666210219114211.
7. iSuc-PseOpt: Identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset.
   Anal Biochem. 2016 Mar 15;497:48-56. doi: 10.1016/j.ab.2015.12.009. Epub 2015 Dec 23.
8. A systematic identification of species-specific protein succinylation sites using joint element features information.
   Int J Nanomedicine. 2017 Aug 28;12:6303-6315. doi: 10.2147/IJN.S140875. eCollection 2017.
9. pSuc-FFSEA: Predicting Lysine Succinylation Sites in Proteins Based on Feature Fusion and Stacking Ensemble Algorithm.
   Front Cell Dev Biol. 2022 May 24;10:894874. doi: 10.3389/fcell.2022.894874. eCollection 2022.
10. iSuc-PseAAC: predicting lysine succinylation in proteins by incorporating peptide position-specific propensity.
   Sci Rep. 2015 Jun 18;5:10184. doi: 10.1038/srep10184.

Cited by

1. RLSuccSite: succinylation sites prediction based on reinforcement learning dynamic with balanced reward mechanism and three-peaks enhanced method for physicochemical property scores.
   J Cheminform. 2025 Jun 2;17(1):92. doi: 10.1186/s13321-025-01034-z.
2. pSuc-EDBAM: Predicting lysine succinylation sites in proteins based on ensemble dense blocks and an attention module.
   BMC Bioinformatics. 2022 Oct 31;23(1):450. doi: 10.1186/s12859-022-05001-5.

References

1. Chi-MIC-share: a new feature selection algorithm for quantitative structure-activity relationship models.
   RSC Adv. 2020 May 27;10(34):19852-19860. doi: 10.1039/d0ra00061b. eCollection 2020 May 26.
2. A high-performance approach for predicting donor splice sites based on short window size and imbalanced large samples.
   Biol Direct. 2019 Apr 11;14(1):6. doi: 10.1186/s13062-019-0236-y.
3. Detecting Succinylation sites from protein sequences using ensemble support vector machine.
   BMC Bioinformatics. 2018 Jun 25;19(1):237. doi: 10.1186/s12859-018-2249-4.
4. Improving succinylation prediction accuracy by incorporating the secondary structure via helix, strand and coil, and evolutionary information from profile bigrams.
   PLoS One. 2018 Feb 12;13(2):e0191900. doi: 10.1371/journal.pone.0191900. eCollection 2018.
5. Success: evolutionary and structural properties of amino acids prove effective for succinylation site prediction.
   BMC Genomics. 2018 Jan 19;19(Suppl 1):923. doi: 10.1186/s12864-017-4336-8.
6. The first succinylome profile of Trichophyton rubrum reveals lysine succinylation on proteins involved in various key cellular processes.
   BMC Genomics. 2017 Aug 4;18(1):577. doi: 10.1186/s12864-017-3977-y.
7. PSSM-Suc: Accurately predicting succinylation using position specific scoring matrix into bigram for feature extraction.
   J Theor Biol. 2017 Jul 21;425:97-102. doi: 10.1016/j.jtbi.2017.05.005. Epub 2017 May 5.
8. SucStruct: Prediction of succinylated lysine residues by using structural properties of amino acids.
   Anal Biochem. 2017 Jun 15;527:24-32. doi: 10.1016/j.ab.2017.03.021. Epub 2017 Mar 28.
9. A New Algorithm to Optimize Maximal Information Coefficient.
   PLoS One. 2016 Jun 22;11(6):e0157567. doi: 10.1371/journal.pone.0157567. eCollection 2016.
10. pSuc-Lys: Predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach.
   J Theor Biol. 2016 Apr 7;394:223-230. doi: 10.1016/j.jtbi.2016.01.020. Epub 2016 Jan 22.