Continuous chromatin state feature annotation of the human epigenome.

Affiliation

School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.

Publication information

Bioinformatics. 2022 May 26;38(11):3029-3036. doi: 10.1093/bioinformatics/btac283.

DOI:10.1093/bioinformatics/btac283
PMID:35451453
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9154241/
Abstract

MOTIVATION

Segmentation and genome annotation (SAGA) algorithms are widely used to understand genome activity and gene regulation. These methods take as input a set of sequencing-based assays of epigenomic activity, such as ChIP-seq measurements of histone modification and transcription factor binding. They output an annotation of the genome that assigns a chromatin state label to each genomic position. Existing SAGA methods have several limitations caused by the discrete annotation framework: such annotations cannot easily represent varying strengths of genomic elements, and they cannot easily represent combinatorial elements that simultaneously exhibit multiple types of activity. To remedy these limitations, we propose an annotation strategy that instead outputs a vector of chromatin state features at each position rather than a single discrete label. Continuous modeling is common in other fields, such as in topic modeling of text documents. We propose a method, epigenome-ssm-nonneg, that uses a non-negative state space model to efficiently annotate the genome with chromatin state features. We also propose several measures of the quality of a chromatin state feature annotation and we compare the performance of several alternative methods according to these quality measures.
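The contrast between a discrete label and a vector of chromatin state features can be illustrated with a small sketch. This is not the epigenome-ssm-nonneg implementation: it swaps in plain non-negative matrix factorization with multiplicative updates, which keeps the non-negativity constraint but ignores the dependence between neighboring positions that a state space model captures. The synthetic signal, the function names (`nonneg_features`, `reconstruction_error`), and all parameter values are assumptions made for illustration.

```python
import numpy as np


def nonneg_features(signal, n_features, n_iter=200, seed=0):
    """Factor a non-negative signal (assays x positions) into a non-negative
    emission matrix (assays x n_features) and non-negative features
    (n_features x positions) using multiplicative updates.

    Simplified stand-in for a non-negative state space model: it has no
    transition term linking adjacent genomic positions.
    """
    rng = np.random.default_rng(seed)
    n_assays, n_positions = signal.shape
    emission = rng.random((n_assays, n_features)) + 0.1
    features = rng.random((n_features, n_positions)) + 0.1
    eps = 1e-9  # avoids division by zero
    for _ in range(n_iter):
        features *= (emission.T @ signal) / (emission.T @ emission @ features + eps)
        emission *= (signal @ features.T) / (emission @ features @ features.T + eps)
    return emission, features


def reconstruction_error(signal, emission, features):
    """Frobenius norm of the residual between signal and its reconstruction."""
    return float(np.linalg.norm(signal - emission @ features))


# Synthetic example (hypothetical data): 4 assays, 60 genomic positions.
# Feature 0 varies in strength over positions 10-29; feature 1 is active
# over positions 25-44, so positions 25-29 show both activities at once.
true_features = np.zeros((2, 60))
true_features[0, 10:30] = np.linspace(0.5, 2.0, 20)
true_features[1, 25:45] = 1.0
true_emission = np.array([[3.0, 0.0], [2.0, 0.5], [0.0, 2.5], [0.2, 1.0]])

noise = np.random.default_rng(1).normal(0.0, 0.05, size=(4, 60))
signal = np.clip(true_emission @ true_features + noise, 0.0, None)

emission, features = nonneg_features(signal, n_features=2)

# A discrete annotation keeps only one label per position, discarding both
# the strength of each feature and any overlap between features.
labels = features.argmax(axis=0)
```

In this toy setting, `features[:, g]` retains a separate non-negative strength for each feature at position `g`, whereas `labels[g]` collapses the position to a single index, which is the limitation of discrete annotations described above.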

RESULTS

We show that chromatin state features from epigenome-ssm-nonneg are more useful for several downstream applications than both continuous and discrete alternatives, including their ability to identify expressed genes and enhancers. Therefore, we expect that these continuous chromatin state features will be valuable reference annotations to be used in visualization and downstream analysis.

AVAILABILITY AND IMPLEMENTATION

Source code for epigenome-ssm is available at https://github.com/habibdanesh/epigenome-ssm and Zenodo (DOI: 10.5281/zenodo.6507585).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fac1/9154241/19db01142aa9/btac283f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fac1/9154241/50301b14a9af/btac283f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fac1/9154241/d08e31b1721f/btac283f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fac1/9154241/af2bed754a1e/btac283f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fac1/9154241/2d57e3f1c6de/btac283f5.jpg
