Enhancer prediction in the human genome by probabilistic modelling of the chromatin feature patterns.

Affiliation

Department of Computer Science, Aalto University, Konemiehentie 2, Espoo, 02150, Finland.

Publication information

BMC Bioinformatics. 2020 Jul 20;21(1):317. doi: 10.1186/s12859-020-03621-3.

DOI:10.1186/s12859-020-03621-3
PMID:32689977
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7370432/

Abstract

BACKGROUND

The binding sites of transcription factors (TFs) and the localisation of histone modifications in the human genome can be quantified by the chromatin immunoprecipitation assay coupled with next-generation sequencing (ChIP-seq). The resulting chromatin feature data has been successfully adopted for genome-wide enhancer identification by several unsupervised and supervised machine learning methods. However, the current methods predict different numbers and different sets of enhancers for the same cell type and do not utilise the pattern of the ChIP-seq coverage profiles efficiently.

RESULTS

In this work, we propose a PRobabilistic Enhancer PRedictIoN Tool (PREPRINT) that assumes characteristic coverage patterns of chromatin features at enhancers and employs a statistical model to account for their variability. PREPRINT defines probabilistic distance measures to quantify the similarity of the genomic query regions and the characteristic coverage patterns. The probabilistic scores of the enhancer and non-enhancer samples are utilised to train a kernel-based classifier. The performance of the method is demonstrated on ENCODE data for two cell lines. The predicted enhancers are computationally validated based on the transcriptional regulatory protein binding sites and compared to the predictions obtained by state-of-the-art methods.

CONCLUSION

PREPRINT performs favourably compared with the state-of-the-art methods, especially when the methods are required to predict a larger set of enhancers. PREPRINT generalises successfully to data from a cell type not utilised for training, where it often performs better than the previous methods. The PREPRINT enhancers are less sensitive to the choice of prediction threshold. PREPRINT identifies biologically validated enhancers not predicted by the competing methods. The enhancers predicted by PREPRINT can aid genome interpretation in functional genomics and clinical studies.
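
The pipeline outlined under RESULTS (a characteristic coverage pattern, a probabilistic score for each query region, then a kernel-based classifier trained on those scores) can be illustrated with a minimal sketch. This is not the PREPRINT implementation. The abstract does not name the statistical model or the classifier, so the independent per-bin Poisson likelihood, the RBF kernel perceptron, the toy read counts, and every function name below are assumptions chosen only for illustration.

```python
import math

def mean_pattern(profiles):
    """Per-bin average coverage over training regions (the 'characteristic pattern')."""
    n = len(profiles)
    return [sum(p[i] for p in profiles) / n for i in range(len(profiles[0]))]

def poisson_loglik(counts, pattern, eps=1e-6):
    """Log-likelihood of binned read counts, assuming independent Poisson bins."""
    total = 0.0
    for k, lam in zip(counts, pattern):
        lam = max(lam, eps)  # guard against log(0) for empty bins
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total

def score_features(counts, enh_pattern, bg_pattern):
    """Two probabilistic scores: fit to the enhancer and to the background pattern."""
    return (poisson_loglik(counts, enh_pattern), poisson_loglik(counts, bg_pattern))

def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two score vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def decision(x, X, y, alpha, gamma):
    """Signed kernel expansion over the training points with non-zero weight."""
    return sum(a * yj * rbf(xj, x, gamma) for a, yj, xj in zip(alpha, y, X) if a)

def train_kernel_perceptron(X, y, gamma=0.01, epochs=20):
    """Kernel perceptron: alpha[i] counts the mistakes made on training point i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, xi in enumerate(X):
            if y[i] * decision(xi, X, y, alpha, gamma) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alpha

def predict(x, X, y, alpha, gamma=0.01):
    """Return +1 (enhancer-like) or -1 (background-like)."""
    return 1 if decision(x, X, y, alpha, gamma) > 0 else -1

# Toy read counts in five bins around a region centre (invented, not ENCODE data).
enhancers = [[1, 4, 11, 6, 2], [3, 5, 13, 4, 1], [2, 6, 12, 5, 3]]
background = [[2, 1, 3, 2, 2], [1, 3, 2, 3, 1], [3, 2, 1, 1, 3]]

enh_pattern = mean_pattern(enhancers)
bg_pattern = mean_pattern(background)

# Each region is represented by its two log-likelihood scores, not by raw counts.
X = [score_features(r, enh_pattern, bg_pattern) for r in enhancers + background]
y = [1] * len(enhancers) + [-1] * len(background)
alpha = train_kernel_perceptron(X, y)

peaked_query = score_features([2, 5, 10, 6, 1], enh_pattern, bg_pattern)
flat_query = score_features([2, 2, 3, 1, 2], enh_pattern, bg_pattern)
print(predict(peaked_query, X, y, alpha), predict(flat_query, X, y, alpha))
```

The point of the design is that the classifier never sees the raw coverage. It sees only how well each region fits the enhancer and background patterns, which is the role the abstract assigns to the probabilistic scores.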

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/cb16f8e43af3/12859_2020_3621_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/7f7576414879/12859_2020_3621_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/093700a078a7/12859_2020_3621_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/487b0bcc7d50/12859_2020_3621_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/765ed3d53f11/12859_2020_3621_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/b2b73de3aa4d/12859_2020_3621_Fig6_HTML.jpg
Fig. 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/74fce765c08a/12859_2020_3621_Fig7_HTML.jpg
Fig. 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfb2/7370432/a9d46af9f70a/12859_2020_3621_Fig8_HTML.jpg
