pyRBDome: a comprehensive computational platform for enhancing RNA-binding proteome data.

Affiliations

https://ror.org/01nrxwf90 Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.

https://ror.org/01nrxwf90 Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, UK.

Publication information

Life Sci Alliance. 2024 Jul 30;7(10). doi: 10.26508/lsa.202402787. Print 2024 Oct.

Abstract

High-throughput proteomics approaches have revolutionised the identification of RNA-binding proteins (RBPome) and RNA-binding sequences (RBDome) across organisms. Yet, the extent of noise, including false positives, associated with these methodologies, is difficult to quantify as experimental approaches for validating the results are generally low throughput. To address this, we introduce pyRBDome, a pipeline for enhancing RNA-binding proteome data in silico. It aligns the experimental results with RNA-binding site (RBS) predictions from distinct machine-learning tools and integrates high-resolution structural data when available. Its statistical evaluation of RBDome data enables quick identification of likely genuine RNA-binders in experimental datasets. Furthermore, by leveraging the pyRBDome results, we have enhanced the sensitivity and specificity of RBS detection through training new ensemble machine-learning models. pyRBDome analysis of a human RBDome dataset, compared with known structural data, revealed that although UV-cross-linked amino acids were more likely to contain predicted RBSs, they infrequently bind RNA in high-resolution structures. This discrepancy underscores the limitations of structural data as benchmarks, positioning pyRBDome as a valuable alternative for increasing confidence in RBDome datasets.
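To make the idea of statistically comparing cross-linked residues with predicted RNA-binding sites concrete, here is a minimal sketch using a one-sided hypergeometric enrichment test. This is an illustrative assumption, not the test pyRBDome itself necessarily uses: the function name `hypergeom_enrichment_p`, the residue positions, and the protein length are all hypothetical toy values.

```python
from math import comb


def hypergeom_enrichment_p(n_total, n_predicted, n_crosslinked, n_overlap):
    """One-sided hypergeometric p-value, P(overlap >= n_overlap).

    n_total:       residues in the protein
    n_predicted:   residues predicted to be RNA-binding sites
    n_crosslinked: residues reported as cross-linked in the experiment
    n_overlap:     residues that are both predicted and cross-linked
    """
    max_overlap = min(n_predicted, n_crosslinked)
    # Number of ways to choose the cross-linked residues from the whole protein.
    denom = comb(n_total, n_crosslinked)
    # Sum the probability mass for overlaps at least as large as observed.
    tail = sum(
        comb(n_predicted, k) * comb(n_total - n_predicted, n_crosslinked - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / denom


# Hypothetical toy data: 1-based residue positions in a 50-residue protein.
protein_length = 50
predicted = {3, 4, 5, 10, 11, 12, 30}   # assumed predictor output
crosslinked = {4, 5, 11, 40}            # assumed experimental cross-links

overlap = len(predicted & crosslinked)
p_value = hypergeom_enrichment_p(
    protein_length, len(predicted), len(crosslinked), overlap
)
print(f"overlap={overlap}, p={p_value:.4g}")
```

A small p-value here would suggest that the cross-linked residues coincide with predicted sites more often than chance would explain. A real analysis would also need multiple-testing correction across proteins and across prediction tools.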

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc9c/11289467/2f4f207d9895/LSA-2024-02787_Fig1.jpg
