A unified multitask architecture for predicting local protein properties.

Affiliation

Machine Learning Department, NEC Labs America, Princeton, New Jersey, United States of America.

Publication information

PLoS One. 2012;7(3):e32235. doi: 10.1371/journal.pone.0032235. Epub 2012 Mar 26.

DOI:10.1371/journal.pone.0032235
PMID:22461885
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3312883/
Abstract

A variety of functionally important protein properties, such as secondary structure, transmembrane topology and solvent accessibility, can be encoded as a labeling of amino acids. Indeed, the prediction of such properties from the primary amino acid sequence is one of the core projects of computational biology. Accordingly, a panoply of approaches have been developed for predicting such properties; however, most such approaches focus on solving a single task at a time. Motivated by recent, successful work in natural language processing, we propose to use multitask learning to train a single, joint model that exploits the dependencies among these various labeling tasks. We describe a deep neural network architecture that, given a protein sequence, outputs a host of predicted local properties, including secondary structure, solvent accessibility, transmembrane topology, signal peptides and DNA-binding residues. The network is trained jointly on all these tasks in a supervised fashion, augmented with a novel form of semi-supervised learning in which the model is trained to distinguish between local patterns from natural and synthetic protein sequences. The task-independent architecture of the network obviates the need for task-specific feature engineering. We demonstrate that, for all of the tasks that we considered, our approach leads to statistically significant improvements in performance, relative to a single task neural network approach, and that the resulting model achieves state-of-the-art performance.
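
The architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window width, layer sizes, task names, label counts, and padding symbol are all hypothetical, and the weights are random rather than trained. The sketch shows two ideas only: a shared representation feeding one small output layer per labeling task, and one possible way to build a "synthetic" local pattern (here, assumed to be a window whose centre residue is swapped for a different one) for the natural-versus-synthetic discrimination task.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # hypothetical padding symbol for positions beyond the sequence ends

# Hypothetical task names and label counts, chosen for illustration only.
TASKS = {"secondary_structure": 3, "solvent_accessibility": 2, "transmembrane": 2}


def windows(seq, width=5):
    """Return one fixed-width window centred on each residue, padded at the ends."""
    half = width // 2
    padded = PAD * half + seq + PAD * half
    return [padded[i:i + width] for i in range(len(seq))]


def corrupt(window, rng):
    """Assumed corruption scheme: swap the centre residue for a different one."""
    mid = len(window) // 2
    choices = [a for a in AMINO_ACIDS if a != window[mid]]
    return window[:mid] + rng.choice(choices) + window[mid + 1:]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MultitaskNet:
    """Shared embedding and hidden layer, with one output layer per task."""

    def __init__(self, width=5, embed_dim=4, hidden_dim=8, seed=0):
        rng = random.Random(seed)

        def rand(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        # Parameters shared by every task.
        self.embed = {a: rand(embed_dim) for a in AMINO_ACIDS + PAD}
        self.hidden = [rand(width * embed_dim) for _ in range(hidden_dim)]
        # Task-specific output layers on top of the shared hidden layer.
        self.heads = {t: [rand(hidden_dim) for _ in range(k)]
                      for t, k in TASKS.items()}

    def shared(self, window):
        """Embed each residue, concatenate, and apply the shared tanh layer."""
        x = [v for a in window for v in self.embed[a]]
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in self.hidden]

    def predict(self, window):
        """Return a label distribution for every task from one shared pass."""
        h = self.shared(window)
        return {t: softmax([sum(w * hi for w, hi in zip(row, h)) for row in head])
                for t, head in self.heads.items()}


if __name__ == "__main__":
    rng = random.Random(1)
    wins = windows("MKTAYIAKQR")      # arbitrary example sequence
    net = MultitaskNet()              # untrained, random weights
    print(net.predict(wins[4]))       # one distribution per task
    print(wins[4], corrupt(wins[4], rng))
```

Because every head reads the same hidden vector, a gradient step on any one task would also update the shared parameters, which is the mechanism by which joint training can exploit dependencies among tasks. Training itself is omitted from this sketch.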

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c19/3312883/22d18f29d7cb/pone.0032235.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c19/3312883/b388c474ab20/pone.0032235.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c19/3312883/162f83d1b73c/pone.0032235.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c19/3312883/f7b177d46c54/pone.0032235.g003.jpg
