Improving the prediction accuracy of protein abundance in Escherichia coli using mRNA accessibility.

Affiliation

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Japan.

Publication information

Nucleic Acids Res. 2020 Aug 20;48(14):e81. doi: 10.1093/nar/gkaa481.

Abstract

RNA secondary structure around translation initiation sites strongly affects the abundance of expressed proteins in Escherichia coli. However, the detailed secondary structural features governing protein abundance remain elusive. Recent advances in high-throughput DNA synthesis and experimental systems make it possible to obtain large amounts of data. Here, we evaluated six types of structural features using two large-scale datasets. We found that accessibility, which is the probability that a given region around the start codon has no base-paired nucleotides, showed the highest correlation with protein abundance in both datasets. Accessibility showed a significantly higher correlation (Spearman's ρ = 0.709) than the widely used minimum free energy (Spearman's ρ = 0.554) in one of the datasets. Interestingly, accessibility showed the highest correlation only when it was calculated by a log-linear model, indicating that both the choice of RNA structural model and how it is used are important. Furthermore, by combining accessibility with the activity of the Shine-Dalgarno sequence, we devised a method for predicting protein abundance more accurately than existing methods. We inferred that the log-linear model has a broader probability distribution than the widely used Turner energy model, which contributed to more accurate quantification of ribosome accessibility to translation initiation sites.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6e6/7641306/a2d5b7f470c1/gkaa481fig1.jpg
