ProteinUnet-An efficient alternative to SPIDER3-single for sequence-based prediction of protein secondary structures.

Affiliations

Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland.

Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, Kraków, Poland.

Publication information

J Comput Chem. 2021 Jan 5;42(1):50-59. doi: 10.1002/jcc.26432. Epub 2020 Oct 15.

DOI:10.1002/jcc.26432
PMID:33058261
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7756333/
Abstract

Predicting protein function and structure from sequence remains an unsolved problem in bioinformatics. The best-performing methods rely heavily on evolutionary information from multiple sequence alignments, so their accuracy deteriorates for sequences with few homologs and, as sequence databases grow, they require long computation times. Here, a single-sequence-based prediction method called ProteinUnet is presented, which uses a U-Net convolutional network architecture. It is compared with the SPIDER3-Single model, which is based on a long short-term memory bidirectional recurrent neural network architecture. Both methods achieve similar results for prediction of secondary structures (both three- and eight-state), half-sphere exposure, and contact number, but ProteinUnet has half as many parameters, a 17 times shorter inference time, and can be trained 11 times faster. Moreover, ProteinUnet tends to perform better for short sequences and for residues with a low number of local contacts. Additionally, loss weighting is presented as an effective way of increasing accuracy for rare secondary structures.
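The loss weighting mentioned at the end of the abstract can be illustrated with a small sketch. The abstract does not state which weighting scheme the authors used, so the inverse-frequency weights below are an assumption chosen for illustration, and the function names (inverse_frequency_weights, weighted_cross_entropy) are hypothetical. The example uses three secondary-structure states encoded as 0 (H), 1 (E), and 2 (C), and shows that an error on a rare class contributes more to the weighted loss than to the unweighted one.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed scheme: weight_c = N / (num_classes * count_c); 0 for unseen classes."""
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def weighted_cross_entropy(probs, labels, weights):
    """Per-residue cross-entropy, scaled by the true class weight and
    normalised by the sum of the weights actually used."""
    numerator = 0.0
    denominator = 0.0
    for p, y in zip(probs, labels):
        numerator += -weights[y] * math.log(max(p[y], 1e-12))
        denominator += weights[y]
    return numerator / denominator if denominator else 0.0


def toy_prediction(true_class, p_true, num_classes=3):
    """Hypothetical helper: put p_true on the true class, spread the rest evenly."""
    rest = (1.0 - p_true) / (num_classes - 1)
    return [p_true if c == true_class else rest for c in range(num_classes)]


# Toy per-residue labels: 6 x H (0), 3 x C (2), 1 x E (1), so E is the rare class.
labels = [0, 0, 0, 0, 0, 0, 2, 2, 2, 1]
weights = inverse_frequency_weights(labels, num_classes=3)

# A model that is confident and right on H and C but poor on the rare E residue.
probs = [toy_prediction(y, 0.1 if y == 1 else 0.9) for y in labels]

unweighted = weighted_cross_entropy(probs, labels, [1.0, 1.0, 1.0])
weighted = weighted_cross_entropy(probs, labels, weights)
print("class weights:", [round(w, 3) for w in weights])
print("unweighted loss:", round(unweighted, 4))
print("weighted loss:", round(weighted, 4))
```

In a deep learning framework the same idea is usually expressed by passing per-class weights to the built-in cross-entropy loss rather than by hand-written loops.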


