Prediction and Characterization of Missing Proteomic Data in Desulfovibrio vulgaris.

Authors

Li Feng, Nie Lei, Wu Gang, Qiao Jianjun, Zhang Weiwen

Affiliation

Division of Biometrics II, Office of Biometrics/OTS/CDER/FDA, Silver Spring, MD 20993-0002, USA.

Publication

Comp Funct Genomics. 2011;2011:780973. doi: 10.1155/2011/780973. Epub 2011 May 4.

DOI:10.1155/2011/780973

PMID:21687592

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3114432/

Abstract

Proteomic datasets are often incomplete due to identification range and sensitivity issues. It becomes important to develop methodologies to estimate missing proteomic data, allowing better interpretation of proteomic datasets and metabolic mechanisms underlying complex biological systems. In this study, we applied an artificial neural network to approximate the relationships between cognate transcriptomic and proteomic datasets of Desulfovibrio vulgaris, and to predict protein abundance for the proteins not experimentally detected, based on several relevant predictors, such as mRNA abundance, cellular role and triple codon counts. The results showed that the coefficients of determination for the trained neural network models ranged from 0.47 to 0.68, providing better modeling than several previous regression models. The validity of the trained neural network model was evaluated using biological information (i.e. operons). To seek understanding of mechanisms causing missing proteomic data, we used a multivariate logistic regression analysis and the result suggested that some key factors, such as protein instability index, aliphatic index, mRNA abundance, effective number of codons (N(c)) and codon adaptation index (CAI) values may be ascribed to whether a given expressed protein can be detected. In addition, we demonstrated that biological interpretation can be improved by use of imputed proteomic datasets.
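The abstract reports model fit as coefficients of determination (0.47 to 0.68). As a reminder of what that statistic measures, the following is a minimal sketch of the standard definition, R² = 1 - SS_res / SS_tot, in plain Python. The function name `r_squared` and the sample values are illustrative assumptions, not the authors' code or data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's predictions
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up illustrative values (NOT data from the study)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(observed, predicted)
```

Under this definition, a value of 0.47 to 0.68 would mean the trained models account for roughly half to two thirds of the variance in the measured protein abundances.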

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/986e/3114432/7d4cdd05ed7a/CFG2011-780973.001.jpg

Similar articles

Prediction and Characterization of Missing Proteomic Data in Desulfovibrio vulgaris.

Comp Funct Genomics. 2011;2011:780973. doi: 10.1155/2011/780973. Epub 2011 May 4.

Integrative analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: a non-linear model to predict abundance of undetected proteins.

Bioinformatics. 2009 Aug 1;25(15):1905-14. doi: 10.1093/bioinformatics/btp325. Epub 2009 May 15.

Integrated Analysis of Transcriptomic and Proteomic Datasets Reveals Information on Protein Expressivity and Factors Affecting Translational Efficiency.

Methods Mol Biol. 2016;1375:123-36. doi: 10.1007/7651_2015_242.

Integrative analysis of transcriptomic and proteomic data of Shewanella oneidensis: missing value imputation using temporal datasets.

Mol Biosyst. 2011 Apr;7(4):1093-104. doi: 10.1039/c0mb00260g. Epub 2011 Jan 7.

Integrated analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: zero-inflated Poisson regression models to predict abundance of undetected proteins.

Bioinformatics. 2006 Jul 1;22(13):1641-7. doi: 10.1093/bioinformatics/btl134. Epub 2006 May 4.

Correlation between mRNA and protein abundance in Desulfovibrio vulgaris: a multiple regression to identify sources of variations.

Biochem Biophys Res Commun. 2006 Jan 13;339(2):603-10. doi: 10.1016/j.bbrc.2005.11.055. Epub 2005 Nov 17.

Correlation of mRNA expression and protein abundance affected by multiple sequence features related to translational efficiency in Desulfovibrio vulgaris: a quantitative analysis.

Genetics. 2006 Dec;174(4):2229-43. doi: 10.1534/genetics.106.065862. Epub 2006 Oct 8.

Engineering Aspects of Olfaction

WGCNA Application to Proteomic and Metabolomic Data Analysis.

Methods Enzymol. 2017;585:135-158. doi: 10.1016/bs.mie.2016.09.016. Epub 2016 Dec 15.

Key Metabolites and Mechanistic Changes for Salt Tolerance in an Experimentally Evolved Sulfate-Reducing Bacterium, .

mBio. 2017 Nov 14;8(6):e01780-17. doi: 10.1128/mBio.01780-17.

Cited by

An efficient ensemble method for missing value imputation in microarray gene expression data.

BMC Bioinformatics. 2021 Apr 13;22(1):188. doi: 10.1186/s12859-021-04109-4.

A Review of Imputation Strategies for Isobaric Labeling-Based Shotgun Proteomics.

J Proteome Res. 2021 Jan 1;20(1):1-13. doi: 10.1021/acs.jproteome.0c00123. Epub 2020 Sep 25.

Proteomics and phosphoproteomics in precision medicine: applications and challenges.

Brief Bioinform. 2019 May 21;20(3):767-777. doi: 10.1093/bib/bbx141.

An integrative imputation method based on multi-omics datasets.

BMC Bioinformatics. 2016 Jun 21;17:247. doi: 10.1186/s12859-016-1122-6.

Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics.

J Proteome Res. 2015 May 1;14(5):1993-2001. doi: 10.1021/pr501138h. Epub 2015 Apr 22.

Integrated analysis of transcriptomic and proteomic data.

Curr Genomics. 2013 Apr;14(2):91-110. doi: 10.2174/1389202911314020003.
