Comprehensive Study on Enhancing Low-Quality Position-Specific Scoring Matrix with Deep Learning for Accurate Protein Structure Property Prediction: Using Bagging Multiple Sequence Alignment Learning.

Authors

Guo Yuzhi, Wu Jiaxiang, Ma Hehuan, Wang Sheng, Huang Junzhou

Affiliations

Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas, USA.

Tencent AI Lab, Shenzhen, China.

Publication information

J Comput Biol. 2021 Apr;28(4):346-361. doi: 10.1089/cmb.2020.0416. Epub 2021 Feb 22.

DOI: 10.1089/cmb.2020.0416
PMID: 33617347

Abstract

Accurate predictions of protein structure properties, for example, secondary structure and solvent accessibility, are essential in analyzing the structure and function of a protein. Position-specific scoring matrix (PSSM) features are widely used in structure property prediction. However, some proteins may have low-quality PSSM features due to insufficient homologous sequences, leading to limited prediction accuracy. To address this limitation, we propose an enhancing scheme for PSSM features. We introduce the "Bagging MSA" (multiple sequence alignment) method to calculate PSSM features used to train our model, adopt a convolutional network to capture local context features and bidirectional long short-term memory for long-term dependencies, and integrate them under an unsupervised framework. Structure property prediction models are then built upon such enhanced PSSM features for more accurate predictions. Moreover, we develop two frameworks to evaluate the effectiveness of the enhanced PSSM features, which also bring the proposed method into real-world scenarios. Empirical evaluation on the CB513, CASP11, and CASP12 data sets indicates that our unsupervised enhancing scheme indeed generates more informative PSSM features for structure property prediction.
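
The abstract does not spell out how "Bagging MSA" is computed, so the following is only a minimal sketch of one plausible reading: bootstrap-sample a few homologs from a multiple sequence alignment to mimic a protein with few homologous sequences, then compute a PSSM-like profile from the sample. The function names (bagging_msa, pssm_from_msa), the uniform background, and the pseudocount are illustrative assumptions, not the authors' procedure; real PSSMs are usually produced by tools such as PSI-BLAST with sequence weighting and empirical background frequencies.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def bagging_msa(msa, n_samples, seed=0):
    """Bootstrap an MSA: keep the query (row 0) and draw n_samples
    homologs with replacement. Hypothetical helper, not from the paper."""
    rng = random.Random(seed)
    query, homologs = msa[0], msa[1:]
    if not homologs:
        return [query]
    return [query] + [rng.choice(homologs) for _ in range(n_samples)]


def pssm_from_msa(msa, pseudocount=1.0):
    """Return an L x 20 matrix of log2-odds scores against a uniform
    background (a simplification of a real PSSM)."""
    length = len(msa[0])
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in msa:
            aa = seq[pos]
            if aa in counts:  # skip gaps and non-standard residues
                counts[aa] += 1.0
        total = sum(counts.values())
        pssm.append(
            [math.log2(counts[aa] / total / background) for aa in AMINO_ACIDS]
        )
    return pssm


# Toy alignment: the first row is the query sequence.
msa = ["ACDA", "ACDA", "AC-A", "GCDA", "ACEA"]
sampled = bagging_msa(msa, n_samples=2, seed=1)
low_quality = pssm_from_msa(sampled)   # profile from a shallow, sampled MSA
high_quality = pssm_from_msa(msa)      # profile from the full MSA
```

Under this reading, pairs such as (low_quality, high_quality) could serve as input and target for an enhancement network like the convolutional and bidirectional LSTM model the abstract mentions; that pairing is also an assumption here, not a detail confirmed by the abstract.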

Similar articles

1
Comprehensive Study on Enhancing Low-Quality Position-Specific Scoring Matrix with Deep Learning for Accurate Protein Structure Property Prediction: Using Bagging Multiple Sequence Alignment Learning.
J Comput Biol. 2021 Apr;28(4):346-361. doi: 10.1089/cmb.2020.0416. Epub 2021 Feb 22.
2
EPTool: A New Enhancing PSSM Tool for Protein Secondary Structure Prediction.
J Comput Biol. 2021 Apr;28(4):362-364. doi: 10.1089/cmb.2020.0417. Epub 2020 Dec 1.
3
Prior knowledge facilitates low homologous protein secondary structure prediction with DSM distillation.
Bioinformatics. 2022 Jul 11;38(14):3574-3581. doi: 10.1093/bioinformatics/btac351.
4
A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction.
PLoS One. 2021 Jul 28;16(7):e0255076. doi: 10.1371/journal.pone.0255076. eCollection 2021.
5
rawMSA: End-to-end Deep Learning using raw Multiple Sequence Alignments.
PLoS One. 2019 Aug 15;14(8):e0220182. doi: 10.1371/journal.pone.0220182. eCollection 2019.
6
Prediction of 8-state protein secondary structures by a novel deep learning architecture.
BMC Bioinformatics. 2018 Aug 3;19(1):293. doi: 10.1186/s12859-018-2280-5.
7
IGPRED: Combination of convolutional neural and graph convolutional networks for protein secondary structure prediction.
Proteins. 2021 Oct;89(10):1277-1288. doi: 10.1002/prot.26149. Epub 2021 May 25.
8
Efficient utilization on PSSM combining with recurrent neural network for membrane protein types prediction.
Comput Biol Chem. 2019 Aug;81:9-15. doi: 10.1016/j.compbiolchem.2019.107094. Epub 2019 Aug 8.
9
MFTrans: A multi-feature transformer network for protein secondary structure prediction.
Int J Biol Macromol. 2024 May;267(Pt 1):131311. doi: 10.1016/j.ijbiomac.2024.131311. Epub 2024 Apr 9.
10
Hybrid framework for membrane protein type prediction based on the PSSM.
Sci Rep. 2024 Jul 26;14(1):17156. doi: 10.1038/s41598-024-68163-7.

Cited by

1
Machine learning approaches for predicting protein-ligand binding sites from sequence data.
Front Bioinform. 2025 Feb 3;5:1520382. doi: 10.3389/fbinf.2025.1520382. eCollection 2025.
2
Deep Ensemble Learning with Atrous Spatial Pyramid Networks for Protein Secondary Structure Prediction.
Biomolecules. 2022 Jun 2;12(6):774. doi: 10.3390/biom12060774.
3
FEPS: A Tool for Feature Extraction from Protein Sequence.
Methods Mol Biol. 2022;2499:65-104. doi: 10.1007/978-1-0716-2317-6_3.