PhyloPGM: boosting regulatory function prediction accuracy using evolutionary information.

Affiliation

School of Computer Science, McGill University, Montreal H3A 0G4, Canada.

Publication information

Bioinformatics. 2022 Jun 24;38(Suppl 1):i299-i306. doi: 10.1093/bioinformatics/btac259.

DOI: 10.1093/bioinformatics/btac259
PMID: 35758792
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9235490/

Abstract

MOTIVATION

The computational prediction of the regulatory function associated with a genomic sequence is of utmost importance in -omics studies, as it facilitates our understanding of the mechanisms underpinning the vast gene regulatory network. Prominent examples in this area include predicting transcription factor binding in DNA regulatory regions and predicting RNA-protein interactions in the context of post-transcriptional gene expression. However, existing computational methods suffer from high false-positive rates and have seldom used any evolutionary information, despite the vast amount of orthologous data available across many extant and ancestral genomes, which presents a ready opportunity to improve their accuracy.

RESULTS

In this study, we present a novel probabilistic approach called PhyloPGM that leverages previously trained transcription factor binding site (TFBS) or RNA-RBP (RNA-binding protein) binding predictors by aggregating their predictions from various orthologous regions, in order to boost the overall prediction accuracy on human sequences. Throughout our experiments, PhyloPGM has shown significant improvement over baselines such as the sequence-based RNA-RBP binding predictor RNATracker and the sequence-based TFBS predictor FactorNet. PhyloPGM is simple in principle and easy to implement, and yet yields impressive results.
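
To make the idea of aggregating predictions across orthologous regions concrete, the following is a minimal illustrative sketch only. It is not the PhyloPGM model, which the abstract describes only as a probabilistic approach; here a plain weighted average of log-odds stands in for it. The function name `aggregate_ortholog_scores`, the `ortholog_weight` parameter, and the species labels are all hypothetical, and the per-sequence scores are assumed to be probabilities produced by some previously trained base predictor.

```python
import math


def logit(p, eps=1e-6):
    """Convert a probability to log-odds, clipping to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def aggregate_ortholog_scores(human_score, ortholog_scores, ortholog_weight=0.5):
    """Blend the base predictor's score on the human sequence with its
    scores on orthologous sequences.

    ASSUMPTION: this is a naive log-odds average for illustration,
    not the probabilistic model used by PhyloPGM.

    human_score      -- predicted binding probability for the human sequence
    ortholog_scores  -- dict mapping a species label to the predicted
                        probability for that species' orthologous region
    ortholog_weight  -- hypothetical mixing weight in [0, 1]
    """
    if not ortholog_scores:
        # No evolutionary evidence available: fall back to the base score.
        return human_score
    mean_ortholog_logit = sum(
        logit(s) for s in ortholog_scores.values()
    ) / len(ortholog_scores)
    combined = (
        (1.0 - ortholog_weight) * logit(human_score)
        + ortholog_weight * mean_ortholog_logit
    )
    return sigmoid(combined)


# Hypothetical scores: a weak human prediction supported by orthologs.
supported = aggregate_ortholog_scores(0.6, {"mouse": 0.9, "dog": 0.9})
# The same human prediction contradicted by an ortholog.
contradicted = aggregate_ortholog_scores(0.6, {"mouse": 0.1})
```

In this sketch, orthologous scores that agree with the human prediction raise the combined score and those that disagree lower it, which is the intuition behind using conservation to reduce false positives.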

AVAILABILITY AND IMPLEMENTATION

The PhyloPGM package is available at https://github.com/BlanchetteLab/PhyloPGM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Figures (images from the PMC full text)

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ea/9235490/c6aa90823073/btac259f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ea/9235490/dc95c9e7aea0/btac259f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ea/9235490/7f9c6e5ba654/btac259f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ea/9235490/c8fb9591c754/btac259f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ea/9235490/d14698444696/btac259f5.jpg
