Correcting mistakes in predicting distributions.

Affiliations

Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Garching/Munich, Germany.

Publication information

Bioinformatics. 2018 Oct 1;34(19):3385-3386. doi: 10.1093/bioinformatics/bty346.

DOI:10.1093/bioinformatics/bty346

PMID:29762646

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6157078/

Abstract

MOTIVATION

Many applications monitor predictions of a whole range of features for biological datasets, e.g. the fraction of secreted human proteins in the human proteome. Results and error estimates are typically derived from publications.

RESULTS

Here, we present a simple, alternative approximation that uses performance estimates of methods to error-correct the predicted distributions. This approximation uses the confusion matrix (TP true positives, TN true negatives, FP false positives and FN false negatives) describing the performance of the prediction tool for correction. As proof-of-principle, the correction was applied to a two-class (membrane/not) and to a seven-class (localization) prediction.
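The abstract does not spell out the correction formula, so the following is only a minimal sketch of one plausible reading, not necessarily what the authors' tool computes. It assumes the confusion matrix comes from a benchmark whose class composition is representative of the target dataset, estimates the share of each predicted class that truly belongs to each observed class, and redistributes the predicted counts accordingly. The function name `correct_distribution` and all numbers are hypothetical.

```python
def correct_distribution(confusion, predicted_counts):
    """Redistribute predicted class counts using a benchmark confusion matrix.

    confusion[t][p] is the number of benchmark items whose true class is t
    and whose predicted class is p (an assumed layout, not from the paper).
    predicted_counts[p] is the number of items predicted as class p in the
    dataset of interest.
    """
    n = len(confusion)
    corrected = [0.0] * n
    for p in range(n):
        column_total = sum(confusion[t][p] for t in range(n))
        if column_total == 0:
            # No benchmark evidence for this predicted class:
            # leave its counts where they are.
            corrected[p] += predicted_counts[p]
            continue
        for t in range(n):
            # Estimated fraction of class-p predictions that are truly class t.
            share = confusion[t][p] / column_total
            corrected[t] += predicted_counts[p] * share
    return corrected


# Hypothetical two-class case (membrane / not membrane); values are invented.
TP, FN, FP, TN = 90, 10, 20, 180
confusion = [[TP, FN],   # true membrane: predicted membrane, predicted not
             [FP, TN]]   # true not:      predicted membrane, predicted not
predicted = [3000, 7000]
corrected = correct_distribution(confusion, predicted)
print(corrected)
```

Because every column of shares sums to one, the corrected counts keep the same total as the predicted counts. The same function accepts a seven-by-seven matrix for the localization case.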

AVAILABILITY AND IMPLEMENTATION

Datasets and a simple JavaScript tool are freely available to all users at http://www.rostlab.org/services/distributions.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9b4/6157078/f00ef4a3166c/bty346f1.jpg
