PhyloPGM: boosting regulatory function prediction accuracy using evolutionary information.

Affiliation

School of Computer Science, McGill University, Montreal H3A 0G4, Canada.

Publication information

Bioinformatics. 2022 Jun 24;38(Suppl 1):i299-i306. doi: 10.1093/bioinformatics/btac259.

Abstract

MOTIVATION

Computationally predicting the regulatory function of a genomic sequence is of great importance in -omics studies, as it facilitates our understanding of the mechanisms underpinning the vast gene regulatory network. Prominent examples include predicting transcription factor binding in DNA regulatory regions and predicting RNA-protein interactions in the context of post-transcriptional gene expression. However, existing computational methods suffer from high false-positive rates and seldom use evolutionary information, despite the vast amount of orthologous data available across many extant and ancestral genomes, which presents a ready opportunity to improve their accuracy.

RESULTS

In this study, we present PhyloPGM, a novel probabilistic approach that leverages previously trained transcription factor binding site (TFBS) or RNA-RBP binding predictors by aggregating their predictions across orthologous regions to boost prediction accuracy on human sequences. In our experiments, PhyloPGM showed significant improvement over baselines such as RNATracker, a sequence-based RNA-RBP binding predictor, and FactorNet, a sequence-based TFBS predictor. PhyloPGM is simple in principle and easy to implement, yet yields impressive results.
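
The general idea of aggregating a pre-trained predictor's outputs across orthologous regions can be illustrated with a minimal sketch. This is not PhyloPGM's actual probabilistic model, which the abstract does not specify; it simply averages per-species scores in log-odds space. The function name `aggregate_ortholog_scores`, the species labels, and the optional per-species weights are all hypothetical and introduced only for illustration.

```python
import math


def logit(p, eps=1e-6):
    """Convert a probability to log-odds, clipping to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def aggregate_ortholog_scores(scores, weights=None):
    """Combine per-species binding probabilities into one score.

    scores:  dict mapping a species label to the probability that a
             pre-trained predictor assigns to that species' orthologous region.
    weights: optional dict of non-negative per-species weights
             (hypothetical; uniform weights are used when omitted).
    Returns the weighted mean of the log-odds, mapped back to a probability.
    """
    if not scores:
        raise ValueError("scores must contain at least one species")
    if weights is None:
        weights = {species: 1.0 for species in scores}
    total = sum(weights[species] for species in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_logit = sum(weights[s] * logit(p) for s, p in scores.items()) / total
    return 1.0 / (1.0 + math.exp(-mean_logit))


# Illustrative (made-up) predictor outputs for one human region and two orthologs.
example_scores = {"human": 0.60, "chimp": 0.70, "mouse": 0.80}
combined = aggregate_ortholog_scores(example_scores)
```

With uniform weights the combined score always lies between the lowest and highest per-species score, so agreement among orthologs reinforces a prediction while a lone high score is pulled toward the others.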

AVAILABILITY AND IMPLEMENTATION

The PhyloPGM package is available at https://github.com/BlanchetteLab/PhyloPGM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

