Multitask Protein Function Prediction through Task Dissimilarity.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2019 Sep-Oct;16(5):1550-1560. doi: 10.1109/TCBB.2017.2684127. Epub 2017 Mar 17.

DOI:10.1109/TCBB.2017.2684127

Abstract

Automated protein function prediction is a challenging problem with distinctive features, such as the hierarchical organization of protein functions and the scarcity of annotated proteins for most biological functions. We propose a multitask learning algorithm addressing both issues. Unlike standard multitask algorithms, which use task (protein function) similarity information as a bias to speed up learning, we show that dissimilarity information enforces separation of rare class labels from frequent class labels, and for this reason is better suited for solving unbalanced protein function prediction problems. We support our claim by showing that a multitask extension of the label propagation algorithm empirically works best when the task relatedness information is represented using a dissimilarity matrix as opposed to a similarity matrix. Moreover, the experimental comparison carried out on three model organisms shows that our method has a more stable performance in both "protein-centric" and "function-centric" evaluation settings.
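
To make the idea in the abstract concrete, the following is a minimal sketch of label propagation on a protein graph with a task-dissimilarity penalty. It is not the authors' algorithm, which the abstract does not specify; the function name `multitask_label_propagation`, the parameters `alpha`, `beta`, and `n_iter`, and the specific update rule (subtracting a fraction of the scores of dissimilar tasks after each propagation step) are all assumptions chosen for illustration.

```python
import numpy as np


def multitask_label_propagation(W, Y, D, alpha=0.8, beta=0.1, n_iter=50):
    """Illustrative sketch only (hypothetical update rule, not the paper's method).

    W : (n, n) non-negative protein-protein similarity matrix.
    Y : (n, T) initial 0/1 annotations, one column per function (task).
    D : (T, T) non-negative task dissimilarity matrix with zero diagonal.
    """
    W = np.asarray(W, dtype=float)
    Y = np.asarray(Y, dtype=float)
    D = np.asarray(D, dtype=float)

    # Row-normalize the graph so each protein averages its neighbors' scores.
    deg = W.sum(axis=1)
    deg[deg == 0] = 1.0  # isolated nodes keep their own labels
    P = W / deg[:, None]

    F = Y.copy()
    for _ in range(n_iter):
        # Single-task step: diffuse scores along the graph, anchored to Y.
        F = alpha * (P @ F) + (1.0 - alpha) * Y
        # Assumed multitask step: lower each task's score in proportion
        # to the scores of the tasks it is dissimilar to.
        F = F - beta * (F @ D)
        # Keep scores in [0, 1] so they remain comparable across tasks.
        F = np.clip(F, 0.0, 1.0)
    return F
```

Under these assumptions, a protein that scores highly for a frequent function has its score for a dissimilar, rare function pushed down, which is one simple way to picture the separation of rare from frequent labels that the abstract describes.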

Similar articles

Multitask learning for protein subcellular location prediction.

IEEE/ACM Trans Comput Biol Bioinform. 2011 May-Jun;8(3):748-59. doi: 10.1109/TCBB.2010.22.

Multitask Feature Selection with Task Descriptors.

Pac Symp Biocomput. 2016;21:261-72.

Inferring latent task structure for Multitask Learning by Multiple Kernel Learning.

BMC Bioinformatics. 2010 Oct 26;11 Suppl 8(Suppl 8):S5. doi: 10.1186/1471-2105-11-S8-S5.

ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval.

BMC Bioinformatics. 2012 May 8;13 Suppl 7(Suppl 7):S2. doi: 10.1186/1471-2105-13-S7-S2.

LOTUS: A single- and multitask machine learning algorithm for the prediction of cancer driver genes.

PLoS Comput Biol. 2019 Sep 30;15(9):e1007381. doi: 10.1371/journal.pcbi.1007381. eCollection 2019 Sep.

GODoc: high-throughput protein function prediction using novel k-nearest-neighbor and voting algorithms.

BMC Bioinformatics. 2020 Nov 18;21(Suppl 6):276. doi: 10.1186/s12859-020-03556-9.

Sparse Markov chain-based semi-supervised multi-instance multi-label method for protein function prediction.

J Bioinform Comput Biol. 2015 Oct;13(5):1543001. doi: 10.1142/S0219720015430015. Epub 2015 Sep 16.

Protein Function Prediction with Incomplete Annotations.

IEEE/ACM Trans Comput Biol Bioinform. 2014 May-Jun;11(3):579-91. doi: 10.1109/TCBB.2013.142.

Compositional model based on factorial evolution for realizing multi-task learning in bacterial virulent protein prediction.

Artif Intell Med. 2019 Nov;101:101757. doi: 10.1016/j.artmed.2019.101757. Epub 2019 Nov 7.

Articles citing this paper

PASS: Protein Annotation Surveillance Site for Protein Annotation Using Homologous Clusters, NLP, and Sequence Similarity Networks.

Front Bioinform. 2021 Sep 29;1:749008. doi: 10.3389/fbinf.2021.749008. eCollection 2021.

Functional annotation of creeping bentgrass protein sequences based on convolutional neural network.

BMC Plant Biol. 2022 May 2;22(1):227. doi: 10.1186/s12870-022-03607-8.

annotation of unreviewed acetylcholinesterase (AChE) in some lepidopteran insect pest species reveals the causes of insecticide resistance.

Saudi J Biol Sci. 2021 Apr;28(4):2197-2209. doi: 10.1016/j.sjbs.2021.01.007. Epub 2021 Jan 21.

PFP-WGAN: Protein function prediction by discovering Gene Ontology term correlations with generative adversarial networks.

PLoS One. 2021 Feb 25;16(2):e0244430. doi: 10.1371/journal.pone.0244430. eCollection 2021.

Predicting Functions of Uncharacterized Human Proteins: From Canonical to Proteoforms.

Genes (Basel). 2020 Jun 21;11(6):677. doi: 10.3390/genes11060677.

Protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning.

Brief Bioinform. 2020 Jul 15;21(4):1437-1447. doi: 10.1093/bib/bbz081.

A GPU-based algorithm for fast node label learning in large and unbalanced biomolecular networks.

BMC Bioinformatics. 2018 Oct 15;19(Suppl 10):353. doi: 10.1186/s12859-018-2301-4.

A novel methodology on distributed representations of proteins using their interacting ligands.

Bioinformatics. 2018 Jul 1;34(13):i295-i303. doi: 10.1093/bioinformatics/bty287.
