RFCM-PALM: In-Silico Prediction of S-Palmitoylation Sites in the Synaptic Proteins for Male/Female Mouse Data.

Authors

Bandyopadhyay Soumyendu Sekhar, Halder Anup Kumar, Zaręba-Kozioł Monika, Bartkowiak-Kaczmarek Anna, Dutta Aviinandaan, Chatterjee Piyali, Nasipuri Mita, Wójtowicz Tomasz, Wlodarczyk Jakub, Basu Subhadip

Affiliations

Department of Computer Science and Engineering, Jadavpur University, Kolkata 700032, India.

Department of Computer Science and Engineering, School of Engineering and Technology, Adamas University, Barasat, Kolkata 700126, India.

Publication information

Int J Mol Sci. 2021 Sep 14;22(18):9901. doi: 10.3390/ijms22189901.

Abstract

S-palmitoylation is a reversible covalent post-translational modification of the cysteine thiol side chain by palmitic acid. It plays a critical role in a variety of biological processes and is implicated in several human diseases. Identifying the specific sites of this modification is therefore crucial for understanding its functional consequences in physiology and pathology. We present a random forest (RF) classifier-based consensus strategy (RFCM-PALM) for predicting palmitoylated cysteine sites on synaptic proteins from male/female mouse data. To design the prediction model, we introduce a heuristic strategy for selecting the optimum set of physicochemical features from the AAIndex dataset using (a) K-Best (KB) features, (b) a genetic algorithm (GA), and (c) a union (UN) of the KB- and GA-based features. Decisions from the best-trained KB-, GA-, and UN-based classifiers are then combined through a three-star quality consensus strategy to further refine and enhance the scores of the individual models. The experiment is carried out on three categorized synaptic protein datasets: male mouse, female mouse, and combined (male + female). In each group, weighted data is used for training, and knock-out data is used as the hold-out set for performance evaluation and comparison. RFCM-PALM shows a ~80% area under the curve (AUC) score in all three dataset categories and achieves a 10% average accuracy improvement (male: 15%, female: 15%, combined: 7%) on the hold-out set compared to other approaches. In summary, our method, with efficient feature selection and a novel consensus strategy, shows significant performance gains in predicting S-palmitoylation sites in mouse datasets.
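The abstract does not spell out how the three-star consensus is scored. As a rough illustration only, the Python sketch below assumes that a site's star level is the number of classifiers (KB, GA, UN) that predict it as palmitoylated, and that a consensus call is made when the star level reaches a chosen threshold. The function names, the feature identifiers, and the threshold rule are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def union_features(kb: Sequence[str], ga: Sequence[str]) -> List[str]:
    """Build the UN feature set: every feature chosen by K-Best or by the GA.

    Order is preserved and duplicates are dropped.
    """
    return list(dict.fromkeys([*kb, *ga]))


def star_scores(
    kb_pred: Sequence[int], ga_pred: Sequence[int], un_pred: Sequence[int]
) -> List[int]:
    """Count, per cysteine site, how many of the three models vote 1 (palmitoylated).

    ASSUMPTION: the star level equals the number of agreeing positive votes (0-3).
    """
    if not (len(kb_pred) == len(ga_pred) == len(un_pred)):
        raise ValueError("all three prediction lists must have the same length")
    return [k + g + u for k, g, u in zip(kb_pred, ga_pred, un_pred)]


def consensus(stars: Sequence[int], min_stars: int) -> List[int]:
    """Call a site positive when its star level is at least min_stars (1, 2, or 3)."""
    if min_stars not in (1, 2, 3):
        raise ValueError("min_stars must be 1, 2, or 3")
    return [1 if s >= min_stars else 0 for s in stars]


# Hypothetical feature identifiers and per-site votes, for illustration only.
kb_features = ["f1", "f2"]
ga_features = ["f2", "f3"]
un_features = union_features(kb_features, ga_features)

kb_votes = [1, 0, 1, 0]
ga_votes = [1, 1, 0, 0]
un_votes = [1, 1, 0, 0]
stars = star_scores(kb_votes, ga_votes, un_votes)
two_star_calls = consensus(stars, min_stars=2)
```

Under this assumed rule, a higher threshold trades recall for precision: requiring three stars keeps only the sites on which all three models agree, while one star accepts any site flagged by at least one model.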

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50fd/8467992/1f95bb3187f4/ijms-22-09901-g001.jpg
