Impact of Multi-Factor Features on Protein Secondary Structure Prediction.

Affiliation

College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China.

Publication information

Biomolecules. 2024 Sep 13;14(9):1155. doi: 10.3390/biom14091155.

DOI:10.3390/biom14091155
PMID:39334921
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11430196/
Abstract

Protein secondary structure prediction (PSSP) plays a crucial role in resolving protein functions and properties. Significant progress has been made in this field in recent years, and the use of a variety of protein-related features, including amino acid sequences, position-specific score matrices (PSSM), amino acid properties, and secondary structure trend factors, to improve prediction accuracy is an important technical route for it. However, a comprehensive evaluation of the impact of these factor features in secondary structure prediction is lacking in the current work. This study quantitatively analyzes the impact of several major factors on secondary structure prediction models using a more explanatory four-class machine learning approach. The applicability of each factor in the different types of methods, the extent to which the different methods work on each factor, and the evaluation of the effect of multi-factor combinations are explored in detail. Through experiments and analyses, it was found that PSSM performs best in methods with strong high-dimensional features and complex feature extraction capabilities, while amino acid sequences, although performing poorly overall, perform relatively well in methods with strong linear processing capabilities. Also, the combination of amino acid properties and trend factors significantly improved the prediction performance. This study provides empirical evidence for future researchers to optimize multi-factor feature combinations and apply them to protein secondary structure prediction models, which is beneficial in further optimizing the use of these factors to enhance the performance of protein secondary structure prediction models.
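
The multi-factor combination described in the abstract can be pictured as concatenating per-residue feature blocks and then flattening a sliding window around each position. The sketch below is illustrative only: the abstract does not state the encoding, window size, or property set the authors used, so the function names, the `use` selector, the zero-filled toy PSSM, and the single hydropathy property are all assumptions made for this example. Secondary structure trend factors are omitted, but would be appended in the same way.

```python
from typing import Dict, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue: str) -> List[float]:
    """20-dimensional one-hot encoding; unknown residues map to all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def residue_features(
    sequence: str,
    pssm: Sequence[Sequence[float]],
    properties: Dict[str, Sequence[float]],
    use: Sequence[str],
) -> List[List[float]]:
    """Concatenate the selected factor features for every residue (hypothetical helper)."""
    n_props = len(next(iter(properties.values())))
    rows = []
    for i, residue in enumerate(sequence):
        row: List[float] = []
        if "sequence" in use:
            row.extend(one_hot(residue))
        if "pssm" in use:
            row.extend(pssm[i])
        if "properties" in use:
            # Residues without a listed property vector fall back to zeros.
            row.extend(properties.get(residue, [0.0] * n_props))
        rows.append(row)
    return rows


def windowed(rows: List[List[float]], half_width: int) -> List[List[float]]:
    """Flatten a sliding window around each residue, zero-padding both ends."""
    pad = [0.0] * len(rows[0])
    out = []
    for i in range(len(rows)):
        window: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            window.extend(rows[j] if 0 <= j < len(rows) else pad)
        out.append(window)
    return out


# Toy inputs (assumed for illustration, not taken from the paper).
seq = "ACD"
toy_pssm = [[0.0] * 20 for _ in seq]                      # placeholder PSSM rows
toy_props = {"A": [1.8], "C": [2.5], "D": [-3.5]}         # Kyte-Doolittle hydropathy
features = windowed(
    residue_features(seq, toy_pssm, toy_props, use=["sequence", "properties"]),
    half_width=1,
)
```

Each row of `features` could then be passed to any per-residue classifier. Changing the `use` list is one simple way to compare single factors against combinations, which is the kind of comparison the study reports.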

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/6e4f3878ebe5/biomolecules-14-01155-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/5c52b4c979f5/biomolecules-14-01155-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/edaef48ffd41/biomolecules-14-01155-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/d04762e503f6/biomolecules-14-01155-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/7a14622d9adf/biomolecules-14-01155-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/6fcc1ad88361/biomolecules-14-01155-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad9/11430196/428d5eed4e2d/biomolecules-14-01155-g007.jpg
