Integrated analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: zero-inflated Poisson regression models to predict abundance of undetected proteins.

Authors

Nie Lei, Wu Gang, Brockman Fred J, Zhang Weiwen

Affiliation

Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC 20057, USA.

Publication information

Bioinformatics. 2006 Jul 1;22(13):1641-7. doi: 10.1093/bioinformatics/btl134. Epub 2006 May 4.

DOI: 10.1093/bioinformatics/btl134

PMID: 16675466

Abstract

MOTIVATION

Integrated analysis of global-scale transcriptomic and proteomic data can provide important insights into the metabolic mechanisms underlying complex biological systems. However, because the relationship between protein abundance and mRNA expression level is complicated by many cellular and physical processes, sophisticated statistical models need to be developed to capture their relationship.

RESULTS

In this study, we describe a novel data-driven statistical model to integrate whole-genome microarray and proteomic data collected from Desulfovibrio vulgaris grown under three different conditions. Based on the Poisson distribution pattern of proteomic data and the fact that a large number of proteins were undetected (excess zeros), zero-inflated Poisson (ZIP)-based models were proposed to define the correlation pattern between mRNA and protein abundance. In addition, by assuming that there is a probability mass at zero representing unexpressed genes and expressed proteins that were undetected owing to technical limitations, a Potential ZIP model was established. Two significant improvements introduced by this approach are (1) the predicted protein abundance values for experimentally detected proteins are corrected by considering their mRNA levels and (2) protein abundance values can be predicted for undetected proteins (in this study, approximately 83% of the proteins in the D. vulgaris genome) for better biological interpretation. We demonstrated the use of these statistical models by comparatively analyzing proteomic and microarray results from D. vulgaris grown on lactate-based versus formate-based media. These models correctly predicted increased expression of Ech hydrogenase and decreased expression of Coo hydrogenase for D. vulgaris grown on formate.
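The zero-inflated Poisson idea described above can be illustrated with a minimal sketch. It is not the authors' fitted model: the abstract does not give the parametrization, so the log link for the Poisson mean, the logit link for the extra zero mass, the coefficient values, and all function names here are assumptions chosen for illustration only.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson: extra zero mass pi, Poisson mean lam."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    zero_mass = pi if k == 0 else 0.0
    return zero_mass + (1.0 - pi) * poisson


def zip_params(mrna, beta=(0.2, 0.8), gamma=(1.5, -1.0)):
    """Hypothetical links: log(lam) and logit(pi) are linear in the mRNA level.

    beta and gamma are made-up illustrative coefficients, not estimates
    from the paper.
    """
    lam = math.exp(beta[0] + beta[1] * mrna)
    pi = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * mrna)))
    return lam, pi


def expected_abundance(mrna):
    """Model-based mean count E[Y] = (1 - pi) * lam, usable even when Y was observed as 0."""
    lam, pi = zip_params(mrna)
    return (1.0 - pi) * lam


def prob_structural_zero(mrna):
    """Given an observed zero, probability that it comes from the extra zero mass."""
    lam, pi = zip_params(mrna)
    return pi / (pi + (1.0 - pi) * math.exp(-lam))


if __name__ == "__main__":
    for mrna in (0.0, 1.0, 2.0):
        print(mrna, expected_abundance(mrna), prob_structural_zero(mrna))
```

Under these assumptions, a protein with an observed count of zero still receives a nonzero predicted abundance driven by its mRNA level, which is the sense in which such a model can assign values to undetected proteins.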
