Klein Hans-Ulrich, Schäfer Martin, Porse Bo T, Hasemann Marie S, Ickstadt Katja, Dugas Martin
Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany.
Bioinformatics. 2014 Apr 15;30(8):1154-1162. doi: 10.1093/bioinformatics/btu003. Epub 2014 Jan 7.
Motivation: Histone modifications are a key epigenetic mechanism to activate or repress the transcription of genes. Datasets of matched transcription data and histone modification data obtained by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel bioinformatics approach to detect genes that show different transcript abundances between two conditions putatively caused by alterations in histone modification.
Results: We introduce a correlation measure for integrative analysis of ChIP-seq and gene transcription data measured by RNA sequencing or microarrays and demonstrate that a proper normalization of ChIP-seq data is crucial. We suggest applying Bayesian mixture models of different types of distributions to further study the distribution of the correlation measure. The implicit classification of the mixture models is used to detect genes with differences between two conditions in both gene transcription and histone modification. The method is applied to different datasets, and its superiority to a naive separate analysis of both data types is demonstrated.
Availability and implementation: R/Bioconductor package epigenomix.
Contact: h.klein@uni-muenster.de
Supplementary information: Supplementary data are available at Bioinformatics online.
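The idea summarized in the abstract, a per-gene correlation measure followed by a mixture-model classification, can be illustrated with a small sketch. This is not the epigenomix implementation, and everything in it is an assumption made for illustration: the score is taken to be the product of standardized between-condition differences in expression and in normalized ChIP-seq signal, the Bayesian mixture model is replaced by a plain maximum-likelihood EM fit, and only two components are used (a zero-centred normal for unchanged genes and an exponential tail for genes where both data types change in the same direction). The function names and toy values are hypothetical.

```python
import math
from statistics import pstdev


def correlation_scores(expr_a, expr_b, chip_a, chip_b):
    """Assumed score: product of standardized between-condition differences.

    Inputs are per-gene values (e.g. log expression, normalized ChIP-seq
    signal) for conditions A and B. Assumes both difference vectors vary.
    """
    dx = [a - b for a, b in zip(expr_a, expr_b)]
    dy = [a - b for a, b in zip(chip_a, chip_b)]
    sx, sy = pstdev(dx), pstdev(dy)
    return [(x / sx) * (y / sy) for x, y in zip(dx, dy)]


def normal_pdf(z, sigma):
    """Density of a zero-mean normal distribution."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def classify_by_mixture(z, n_iter=50):
    """EM for a two-component mixture: N(0, s0^2) plus Exp(rate) on z > 0.

    A maximum-likelihood stand-in for a Bayesian mixture model. Returns the
    posterior probability that each gene belongs to the exponential
    (concordantly changed) component. Assumes at least one positive score.
    """
    positives = [v for v in z if v > 0]
    pi1 = 0.1                              # initial weight of the exponential part
    s0 = pstdev(z)                         # initial null standard deviation
    rate = len(positives) / sum(positives)
    resp = [0.0] * len(z)
    for _ in range(n_iter):
        # E-step: responsibility of the exponential component for each gene
        resp = []
        for v in z:
            p1 = pi1 * rate * math.exp(-rate * v) if v > 0 else 0.0
            p0 = (1 - pi1) * normal_pdf(v, s0)
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.0)
        # M-step: weighted maximum-likelihood updates
        r_sum = sum(resp)
        rz_sum = sum(r * v for r, v in zip(resp, z))
        if r_sum > 0 and rz_sum > 0:
            pi1 = r_sum / len(z)
            rate = r_sum / rz_sum
        null_weight = sum(1 - r for r in resp)
        if null_weight > 0:
            var0 = sum((1 - r) * v * v for r, v in zip(resp, z)) / null_weight
            s0 = math.sqrt(max(var0, 1e-12))
    return resp


# Toy data: ten genes, only the last one changes in both data types.
expr_a = [5.1, 4.8, 6.0, 7.15, 3.9, 5.05, 4.95, 6.2, 5.85, 9.0]
expr_b = [5.0, 5.0, 6.0, 7.0, 4.0, 5.0, 5.0, 6.0, 6.0, 6.0]
chip_a = [1.9, 2.1, 2.2, 1.8, 2.0, 2.1, 1.9, 2.05, 2.15, 6.0]
chip_b = [2.0] * 10

scores = correlation_scores(expr_a, expr_b, chip_a, chip_b)
posterior = classify_by_mixture(scores)
for gene, (s, p) in enumerate(zip(scores, posterior)):
    print(f"gene {gene}: score={s:.3f}, P(changed)={p:.3f}")
```

A gene that changes in only one data type gets a score near zero, because the product vanishes when either difference is small. That is the intuition behind analysing the two data types jointly rather than intersecting two separate gene lists. A fuller version would add a component for negative scores and place priors on the parameters.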