Algorithmic Bioinformatics, Center for Bioinformatics Saar, Saarland Informatics Campus, 66123 Saarbrücken, Germany.
Fakultät MI, Saarland University, Saarland Informatics Campus, 66123 Saarbrücken, Germany.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae178.
Automated chromatin segmentation based on ChIP-seq (chromatin immunoprecipitation followed by sequencing) data provides insight into the epigenetic regulation of chromatin accessibility. Existing segmentation methods rely on simplifying modeling assumptions, which may reduce segmentation quality.
We introduce EpiSegMix, a novel segmentation method based on a hidden Markov model with flexible read count distribution types and state duration modeling, which allows both histone signals and segment lengths to be modeled more closely. In a comparison with the existing tools ChromHMM, Segway, and EpiCSeg, we show that EpiSegMix is more predictive of cell biology, such as gene expression. Its flexible framework enables it to fit an accurate probabilistic model, which has the potential to increase the biological interpretability of chromatin states.
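To make the idea of an HMM over read counts concrete, the following is a minimal sketch, not the EpiSegMix implementation. It assumes a single histone mark, two hypothetical states (background and enriched), negative binomial emissions as one possible read count distribution, and hand-picked parameters; the names `nb_logpmf` and `viterbi` are illustrative. It also uses plain self-transitions, which imply geometrically distributed segment lengths, so the state duration modeling described above is not covered here.

```python
import math


def nb_logpmf(k, r, p):
    """Log-probability of count k under a negative binomial distribution
    with size r > 0 and success probability 0 < p < 1 (mean r * (1 - p) / p)."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def viterbi(counts, log_init, log_trans, emission_params):
    """Return the most likely state path for a sequence of per-bin read counts."""
    n_states = len(log_init)
    # score[s]: best log-probability of any path that ends in state s
    score = [log_init[s] + nb_logpmf(counts[0], *emission_params[s])
             for s in range(n_states)]
    back = []  # back[t][s]: best predecessor of state s at bin t + 1
    for k in counts[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda q: prev[q] + log_trans[q][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + nb_logpmf(k, *emission_params[s]))
        back.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters: state 0 = background (mean 1), state 1 = enriched (mean 20)
emission_params = [(1.0, 0.5), (5.0, 0.2)]
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]

counts = [0, 1, 0, 18, 25, 22, 1, 0]  # toy read counts per genomic bin
path = viterbi(counts, log_init, log_trans, emission_params)
print(path)  # [0, 0, 0, 1, 1, 1, 0, 0]
```

In this toy setting the decoded path marks the three high-count bins as one enriched segment. Swapping `nb_logpmf` for another count distribution changes only the emission term, which is the kind of flexibility the abstract refers to.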
Source code: https://gitlab.com/rahmannlab/episegmix.