
Sparsely correlated hidden Markov models with application to genome-wide location studies.

Affiliation

National University of Singapore and National University Health System, Singapore 117597, Singapore.

Publication information

Bioinformatics. 2013 Mar 1;29(5):533-41. doi: 10.1093/bioinformatics/btt012. Epub 2013 Jan 16.

DOI: 10.1093/bioinformatics/btt012
PMID: 23325620
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3582268/
Abstract

MOTIVATION

Multiply correlated datasets have become increasingly common in genome-wide location analysis of regulatory proteins and epigenetic modifications. Their correlation can be directly incorporated into a statistical model to capture underlying biological interactions, but such modeling quickly becomes computationally intractable.

RESULTS

We present sparsely correlated hidden Markov models (scHMM), a novel method for performing simultaneous hidden Markov model (HMM) inference for multiple genomic datasets. In scHMM, a single HMM is assumed for each series, but the transition probability in each series depends on not only its own hidden states but also the hidden states of other related series. For each series, scHMM uses penalized regression to select a subset of the other data series and estimate their effects on the odds of each transition in the given series. Following this, hidden states are inferred using a standard forward-backward algorithm, with the transition probabilities adjusted by the model at each position, which helps retain the order of computation close to fitting independent HMMs (iHMM). Hence, scHMM is a collection of inter-dependent non-homogeneous HMMs, capable of giving a close approximation to a fully multivariate HMM fit. A simulation study shows that scHMM achieves comparable sensitivity to the multivariate HMM fit at a much lower computational cost. The method was demonstrated in the joint analysis of 39 histone modifications, CTCF and RNA polymerase II in human CD4+ T cells. scHMM reported fewer high-confidence regions than iHMM in this dataset, but scHMM could recover previously characterized histone modifications in relevant genomic regions better than iHMM. In addition, the resulting combinatorial patterns from scHMM could be better mapped to the 51 states reported by the multivariate HMM method of Ernst and Kellis.
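The inference step described above (a forward-backward pass whose transition probabilities change at each position) can be sketched as follows. This is a minimal illustration, not the scHMM package's code: it assumes two hidden states per series, a logistic link between the other series' states and the transition odds, and coefficient vectors that have already been estimated (with most entries zero, as a penalized fit would give). The function names `transition_matrices` and `forward_backward`, and the choice to use the covariates at the destination position, are hypothetical.

```python
import numpy as np


def logistic(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + np.exp(-x))


def transition_matrices(neighbor_states, beta_01, beta_10):
    """Build one 2x2 transition matrix per step t -> t+1.

    neighbor_states: (T, M) array of 0/1 state estimates for M other series.
    beta_01, beta_10: length M+1 coefficient vectors (intercept first) for the
        log-odds of the 0->1 and 1->0 transitions (assumed already fitted).
    Returns an array A of shape (T-1, 2, 2); A[t] governs the move from
    position t to position t+1, using the covariates at position t+1
    (an assumption of this sketch).
    """
    T = neighbor_states.shape[0]
    X = np.hstack([np.ones((T, 1)), neighbor_states])[1:]
    p01 = logistic(X @ beta_01)
    p10 = logistic(X @ beta_10)
    A = np.empty((T - 1, 2, 2))
    A[:, 0, 0] = 1.0 - p01
    A[:, 0, 1] = p01
    A[:, 1, 0] = p10
    A[:, 1, 1] = 1.0 - p10
    return A


def forward_backward(emission_lik, A, init):
    """Posterior state probabilities for a non-homogeneous two-state HMM.

    emission_lik: (T, 2) likelihood of each observation under each state.
    A: (T-1, 2, 2) position-specific transition matrices.
    init: (2,) initial state distribution.
    """
    T = emission_lik.shape[0]
    alpha = np.empty((T, 2))
    beta = np.ones((T, 2))

    # Forward pass, rescaled at every position to avoid underflow.
    a = init * emission_lik[0]
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ A[t - 1]) * emission_lik[t]
        alpha[t] = a / a.sum()

    # Backward pass, also rescaled.
    for t in range(T - 2, -1, -1):
        b = A[t] @ (emission_lik[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()

    # The per-position scaling constants cancel in this normalization.
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)
```

Because each series keeps its own two-state chain and only the transition matrices vary by position, the cost per series stays linear in the sequence length, which matches the abstract's point that the computation remains close to that of fitting independent HMMs.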

AVAILABILITY

The scHMM package can be freely downloaded from http://sourceforge.net/p/schmm/ and is recommended for use in a Linux environment.

Similar articles

1. Sparsely correlated hidden Markov models with application to genome-wide location studies.
   Bioinformatics. 2013 Mar 1;29(5):533-41. doi: 10.1093/bioinformatics/btt012. Epub 2013 Jan 16.
2. Computationally Tractable Multivariate HMM in Genome-Wide Mapping Studies.
   Methods Mol Biol. 2017;1552:135-148. doi: 10.1007/978-1-4939-6753-7_10.
3. Annotation of genomics data using bidirectional hidden Markov models unveils variations in Pol II transcription cycle.
   Mol Syst Biol. 2014 Dec 19;10(12):768. doi: 10.15252/msb.20145654.
4. Hidden Markov Models in Bioinformatics: SNV Inference from Next Generation Sequence.
   Methods Mol Biol. 2017;1552:123-133. doi: 10.1007/978-1-4939-6753-7_9.
5. MRHMMs: multivariate regression hidden Markov models and the variantS.
   Bioinformatics. 2014 Jun 15;30(12):1755-6. doi: 10.1093/bioinformatics/btu070. Epub 2014 Feb 19.
6. FactorialHMM: fast and exact inference in factorial hidden Markov models.
   Bioinformatics. 2019 Jun 1;35(12):2162-2164. doi: 10.1093/bioinformatics/bty944.
7. A hidden Markov model to identify combinatorial epigenetic regulation patterns for estrogen receptor α target genes.
   Bioinformatics. 2013 Jan 1;29(1):22-8. doi: 10.1093/bioinformatics/bts639. Epub 2012 Oct 26.
8. An HMM approach to genome-wide identification of differential histone modification sites from ChIP-seq data.
   Bioinformatics. 2008 Oct 15;24(20):2344-9. doi: 10.1093/bioinformatics/btn402. Epub 2008 Jul 29.
9. Inference of genomic landscapes using ordered Hidden Markov Models with emission densities (oHMMed).
   BMC Bioinformatics. 2024 Apr 16;25(1):151. doi: 10.1186/s12859-024-05751-4.
10. PSE-HMM: genome-wide CNV detection from NGS data using an HMM with Position-Specific Emission probabilities.
    BMC Bioinformatics. 2016 Nov 3;18(1):30. doi: 10.1186/s12859-016-1296-y.

Cited by

1. Bayesian adaptive group lasso with semiparametric hidden Markov models.
   Stat Med. 2019 Apr 30;38(9):1634-1650. doi: 10.1002/sim.8051. Epub 2018 Nov 28.
2. Chromatin-state discovery and genome annotation with ChromHMM.
   Nat Protoc. 2017 Dec;12(12):2478-2492. doi: 10.1038/nprot.2017.124. Epub 2017 Nov 9.
3. Integrating Epigenomics into the Understanding of Biomedical Insight.
   Bioinform Biol Insights. 2016 Dec 4;10:267-289. doi: 10.4137/BBI.S38427. eCollection 2016.
4. Joint analysis of expression profiles from multiple cancers improves the identification of microRNA-gene interactions.
   Bioinformatics. 2013 Sep 1;29(17):2137-45. doi: 10.1093/bioinformatics/btt341. Epub 2013 Jun 14.

References

1. Regularization Paths for Generalized Linear Models via Coordinate Descent.
   J Stat Softw. 2010;33(1):1-22.
2. Discovery and characterization of chromatin states for systematic annotation of the human genome.
   Nat Biotechnol. 2010 Aug;28(8):817-25. doi: 10.1038/nbt.1662. Epub 2010 Jul 25.
3. HPeak: an HMM-based algorithm for defining read-enriched regions in ChIP-Seq data.
   BMC Bioinformatics. 2010 Jul 2;11:369. doi: 10.1186/1471-2105-11-369.
4. Genome-wide mapping of HATs and HDACs reveals distinct functions in active and inactive genes.
   Cell. 2009 Sep 4;138(5):1019-31. doi: 10.1016/j.cell.2009.06.049. Epub 2009 Aug 20.
5. Hierarchical hidden Markov model with application to joint analysis of ChIP-chip and ChIP-seq data.
   Bioinformatics. 2009 Jul 15;25(14):1715-21. doi: 10.1093/bioinformatics/btp312. Epub 2009 May 14.
6. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.
   Genome Biol. 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25. Epub 2009 Mar 4.
7. Combinatorial patterns of histone acetylations and methylations in the human genome.
   Nat Genet. 2008 Jul;40(7):897-903. doi: 10.1038/ng.154. Epub 2008 Jun 15.
8. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells.
   Nature. 2007 Aug 2;448(7153):553-60. doi: 10.1038/nature06008. Epub 2007 Jul 1.
9. The landscape of histone modifications across 1% of the human genome in five human cell lines.
   Genome Res. 2007 Jun;17(6):691-707. doi: 10.1101/gr.5704207.
10. Genome-wide mapping of in vivo protein-DNA interactions.
    Science. 2007 Jun 8;316(5830):1497-502. doi: 10.1126/science.1141319. Epub 2007 May 31.