Mehdi Ahmed M, Patrick Ralph, Bailey Timothy L, Bodén Mikael
Institute for Molecular Bioscience, The University of Queensland, Brisbane, 4072, Australia;
Mol Cell Proteomics. 2014 May;13(5):1330-40. doi: 10.1074/mcp.M113.033076. Epub 2014 Feb 16.
Protein synthesis is finely regulated across all organisms, from bacteria to humans, and its integrity underpins many important processes. Emerging evidence suggests that the dynamic range of protein abundance is greater than that observed at the transcript level. Technological breakthroughs now mean that sequencing-based measurement of mRNA levels is routine, but protocols for measuring protein abundance remain both complex and expensive. This paper introduces a Bayesian network that integrates transcriptomic and proteomic data to predict protein abundance and to model the effects of its determinants. We aim to use this model to follow a molecular response over time, from condition-specific data, in order to understand adaptation during processes such as the cell cycle. With microarray data now available for many conditions, the general utility of a protein abundance predictor is broad. Whereas most quantitative proteomics studies have focused on higher organisms, we developed a predictive model of protein abundance for both Saccharomyces cerevisiae and Schizosaccharomyces pombe to explore the latitude at the protein level. Our predictor primarily relies on mRNA level, mRNA-protein interaction, mRNA folding energy and half-life, and tRNA adaptation. The combination of key features, allowing for the low certainty and uneven coverage of experimental observations, gives comparatively minor but robust prediction accuracy. The model substantially improved the analysis of protein regulation during the cell cycle: predicted protein abundance identified twice as many cell-cycle-associated proteins as experimental mRNA levels. Predicted protein abundance was more dynamic than observed mRNA expression, agreeing with experimental protein abundance from a human cell line. We illustrate how the same model can be used to predict the folding energy of mRNA when protein abundance is available, lending credence to the emerging view that mRNA folding affects translation efficiency. 
The software and data used in this research are available at http://bioinf.scmb.uq.edu.au/proteinabundance/.
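The two directions of inference described in the abstract (predicting protein abundance from mRNA-level features, and predicting mRNA folding energy once protein abundance is known) can be illustrated with a toy discrete Bayesian network. This is a minimal sketch and not the authors' model: the three binary variables, the network structure, every probability value, and the assumption that weakly folded mRNA favors high protein abundance are hypothetical and chosen only for illustration. The published software at the URL above should be consulted for the actual implementation.

```python
from itertools import product

# Hypothetical priors (illustrative values, not taken from the paper).
P_MRNA = {"high": 0.4, "low": 0.6}       # mRNA level
P_FOLD = {"stable": 0.5, "weak": 0.5}    # mRNA folding (stable = strongly folded)

# Hypothetical conditional table: P(protein = "high" | mRNA level, folding).
P_PROTEIN_HIGH = {
    ("high", "weak"): 0.9,
    ("high", "stable"): 0.6,
    ("low", "weak"): 0.4,
    ("low", "stable"): 0.1,
}


def joint(mrna, fold, protein):
    """Joint probability of one full assignment under the toy network."""
    p_high = P_PROTEIN_HIGH[(mrna, fold)]
    p_protein = p_high if protein == "high" else 1.0 - p_high
    return P_MRNA[mrna] * P_FOLD[fold] * p_protein


def posterior(target, evidence):
    """Return P(target | evidence) by enumerating the joint distribution."""
    names = ("mrna", "fold", "protein")
    domains = {"mrna": P_MRNA, "fold": P_FOLD, "protein": ("high", "low")}
    scores = {}
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        # Skip assignments that contradict the observed evidence.
        if any(assignment[key] != value for key, value in evidence.items()):
            continue
        key = assignment[target]
        scores[key] = scores.get(key, 0.0) + joint(**assignment)
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


# Forward use: predict protein abundance from mRNA level and folding.
forward = posterior("protein", {"mrna": "high", "fold": "weak"})

# Reverse use: infer folding when protein abundance is observed.
reverse = posterior("fold", {"mrna": "high", "protein": "high"})
```

Because both queries run against the same joint distribution, no separate model is needed for the reverse direction. Only the set of observed variables changes.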