Multistage Combination Classifier Augmented Model for Protein Secondary Structure Prediction.

Authors

Zhang Xu, Liu Yiwei, Wang Yaming, Zhang Liang, Feng Lin, Jin Bo, Zhang Hongzhe

Affiliations

College of Mechanical Engineering, Dalian University of Technology, Dalian, China.

School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian, China.

Publication information

Front Genet. 2022 May 23;13:769828. doi: 10.3389/fgene.2022.769828. eCollection 2022.

DOI:10.3389/fgene.2022.769828

PMID:35677562

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9170271/

Abstract

In bioinformatics, understanding protein secondary structure is important for studying diseases and finding new treatments. Because determining protein secondary structure through physical experiments is time-consuming and expensive, pattern recognition and machine learning methods have been proposed to predict it. However, most of these methods achieve quite similar performance, which suggests that they have reached a model capacity bottleneck. Both model design and the learning process affect a model's learning capacity; we focus on the latter. To this end, we propose a framework called the Multistage Combination Classifier Augmented Model (MCCM) for protein secondary structure prediction. First, a feature extraction module extracts features with different levels of learning difficulty. Second, multistage combination classifiers learn separate decision boundaries for easy and hard samples; the hard-sample classifier penalizes the loss on hard samples, which improves prediction performance on them. Third, a sample difficulty discrimination module, based on the Dirichlet distribution and an information entropy measure, assigns samples of different learning difficulty to these classifiers. Experimental results on the publicly available CB513 benchmark dataset show that our method outperforms most state-of-the-art models.
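The abstract does not give the exact formulation of the difficulty discrimination module, so the following is only a minimal sketch of one plausible reading. It assumes that a first-stage classifier emits non-negative per-class "evidence", that the Dirichlet parameters are evidence plus one, and that a sample counts as hard when the normalized entropy of the expected class probabilities exceeds a threshold. The function names, the threshold of 0.5, and the `hard_weight` factor are all hypothetical and are not taken from the paper.

```python
import math

def dirichlet_mean(evidence):
    """Expected class probabilities of Dirichlet(alpha), with alpha = evidence + 1 (assumed)."""
    alpha = [e + 1.0 for e in evidence]
    total = sum(alpha)
    return [a / total for a in alpha]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the score lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def route_sample(evidence, threshold=0.5):
    """Label one residue 'easy' or 'hard' from its class evidence (threshold is hypothetical)."""
    score = normalized_entropy(dirichlet_mean(evidence))
    return "hard" if score > threshold else "easy"

def penalized_cross_entropy(probs, label, is_hard, hard_weight=2.0):
    """Cross-entropy for one sample, scaled up when the sample is hard (weight is hypothetical)."""
    loss = -math.log(probs[label])
    return hard_weight * loss if is_hard else loss

# Three-state example (helix, strand, coil) with made-up evidence values.
confident = [50.0, 1.0, 0.5]   # one class dominates, low entropy
ambiguous = [1.0, 1.2, 0.9]    # nearly uniform, high entropy
for name, ev in (("confident", confident), ("ambiguous", ambiguous)):
    probs = dirichlet_mean(ev)
    kind = route_sample(ev)
    loss = penalized_cross_entropy(probs, 0, kind == "hard")
    print(name, kind, round(loss, 3))
```

In this sketch, routing and loss weighting are decoupled on purpose: the entropy score only decides which classifier handles a sample, while the penalty applies only to the loss of samples sent to the hard-sample classifier.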

Similar articles

DeepACLSTM: deep asymmetric convolutional long short-term memory neural models for protein secondary structure prediction.

BMC Bioinformatics. 2019 Jun 17;20(1):341. doi: 10.1186/s12859-019-2940-0.

circRNA-binding protein site prediction based on multi-view deep learning, subspace learning and multi-view classifier.

Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab394.

DeepSSPred: A Deep Learning Based Sulfenylation Site Predictor Via a Novel nSegmented Optimize Federated Feature Encoder.

Protein Pept Lett. 2021;28(6):708-721. doi: 10.2174/0929866527666201202103411.

A novel approach for protein secondary structure prediction using encoder-decoder with attention mechanism model.

Biomol Concepts. 2024 Mar 13;15(1). doi: 10.1515/bmc-2022-0043. eCollection 2024 Jan 1.

A two-stage approach towards protein secondary structure classification.

Med Biol Eng Comput. 2020 Aug;58(8):1723-1737. doi: 10.1007/s11517-020-02194-w. Epub 2020 May 29.

Protein Secondary Structure Prediction Based on Data Partition and Semi-Random Subspace Method.

Sci Rep. 2018 Jun 29;8(1):9856. doi: 10.1038/s41598-018-28084-8.

Prediction of 8-state protein secondary structures by a novel deep learning architecture.

BMC Bioinformatics. 2018 Aug 3;19(1):293. doi: 10.1186/s12859-018-2280-5.

Protein Secondary Structure Prediction With a Reductive Deep Learning Method.

Front Bioeng Biotechnol. 2021 Jun 15;9:687426. doi: 10.3389/fbioe.2021.687426. eCollection 2021.

A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

J Med Syst. 2017 Nov 9;41(12):201. doi: 10.1007/s10916-017-0853-x.

Cited by

Recent Advances in Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences.

Methods Mol Biol. 2025;2870:1-19. doi: 10.1007/978-1-0716-4213-9_1.

Prediction of protein secondary structure by the improved TCN-BiLSTM-MHA model with knowledge distillation.

Sci Rep. 2024 Jul 17;14(1):16488. doi: 10.1038/s41598-024-67403-0.
