MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs.

Authors

Kao Hui-Ju, Weng Shun-Long, Huang Kai-Yao, Kaunang Fergie Joanda, Hsu Justin Bo-Kai, Huang Chien-Hsun, Lee Tzong-Yi

Affiliations

Department of Computer Science and Engineering, Yuan Ze University, Taoyuan City, 320, Taiwan.

Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan.

Publication

BMC Syst Biol. 2017 Dec 21;11(Suppl 7):137. doi: 10.1186/s12918-017-0511-4.

Abstract

BACKGROUND

Carbonylation is an irreversible oxidative modification of proteins that occurs when reactive oxygen species (ROS) oxidize specific residues. It has been linked to a number of metabolic and aging-related diseases, including diabetes, chronic lung disease, Parkinson's disease, and Alzheimer's disease. Because no computational method was dedicated to exploring the motif signatures of protein carbonylation sites, we applied an iterative statistical method to characterize and identify carbonylated sites with motif signatures.

RESULTS

By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. To identify attributes that discriminate between carbonylated and non-carbonylated sites, we investigated several feature types: the composition of the twenty amino acids (AAC), the composition of amino acid pairs (AAPC), the position-specific scoring matrix (PSSM), and the positional weighted matrix (PWM). To explore the motif signatures of carbonylation sites, an iterative statistical method was used to detect statistically significant dependencies in amino acid composition between specific positions around the substrate sites. A profile hidden Markov model (HMM) was then trained as a predictive model for each motif signature. Finally, a support vector machine (SVM) was used to build an integrative model that combines the bit scores obtained from the profile HMMs. The combinatorial model provided improved performance, with balanced sensitivity and specificity, in both cross-validation and independent testing.
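As an illustration of the simplest feature types named above, the following Python sketch computes AAC and AAPC vectors for a sequence window centered on a candidate site, together with the sensitivity and specificity used to judge a classifier. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the example window, and the treatment of AAPC as adjacent (gap-free) residue pairs are all hypothetical choices made here, since the abstract does not specify them.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard residues


def aac(window: str) -> list[float]:
    """Fraction of each of the twenty amino acids in the window (20 values)."""
    residues = [r for r in window if r in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues) or 1  # avoid division by zero on an empty window
    return [counts[a] / total for a in AMINO_ACIDS]


def aapc(window: str) -> list[float]:
    """Fraction of each adjacent amino acid pair in the window (400 values).

    Assumption: pairs are taken from neighbouring positions with no gap.
    """
    pairs = [window[i:i + 2] for i in range(len(window) - 1)]
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(pairs)
    total = len(pairs) or 1
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = carbonylated)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical 11-residue window with a candidate lysine (K) at the center.
example_window = "GAKLKRTPKAG"
aac_vector = aac(example_window)
aapc_vector = aapc(example_window)
```

Concatenating such vectors gives one fixed-length feature row per candidate site, which is the form a classifier such as an SVM expects. Comparing sensitivity and specificity side by side is one way to check whether a model is balanced rather than biased toward one class.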

CONCLUSION

This study provides a new scheme for exploring potential motif signatures at the substrate sites of protein carbonylation. The usefulness of the revealed motifs for identifying carbonylated sites is demonstrated by their effective performance in cross-validation and independent testing. Finally, these substrate motifs were used to build an online resource (MDD-Carb, http://csb.cse.yzu.edu.tw/MDDCarb/) and are expected to facilitate the study of large-scale carbonylated proteomes.
