Cao Yingying, Kitanovski Simo, Hoffmann Daniel
Bioinformatics and Computational Biophysics, Faculty of Biology and Center for Medical Biotechnology (ZMB), University of Duisburg-Essen, Universitätsstr. 2, Essen, 45141, Germany.
BMC Genomics. 2020 Dec 29;21(Suppl 11):802. doi: 10.1186/s12864-020-07205-6.
RNA-Seq, the high-throughput sequencing (HT-Seq) of mRNAs, has become an essential tool for characterizing gene expression differences between cell types and conditions. Gene expression is regulated by several mechanisms, including epigenetic regulation through post-translational histone modifications, which can be assessed by ChIP-Seq (Chromatin Immuno-Precipitation Sequencing). As more and more biological samples are analyzed by a combination of ChIP-Seq and RNA-Seq, the integrated analysis of the corresponding data sets becomes, in theory, a unique opportunity to study gene regulation. In practice, however, such analyses are still in their infancy.
Here we introduce intePareto, a computational tool for the integrative analysis of RNA-Seq and ChIP-Seq data. With intePareto we match RNA-Seq and ChIP-Seq data at the level of genes, perform differential expression analysis between biological conditions, and prioritize genes with consistent changes in RNA-Seq and ChIP-Seq data using Pareto optimization.
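The prioritization step can be illustrated with a minimal sketch of Pareto-front ranking. This is not the intePareto implementation (the tool's actual code and its objective definitions are not shown here); the function names, gene labels, and scores below are hypothetical. It assumes each gene has already been assigned one score per objective, for example one score per histone mark measuring how consistently the ChIP-Seq change agrees with the RNA-Seq change, with larger values being better.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b in every objective
    and strictly better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front_ranks(scores: Sequence[Sequence[float]]) -> List[int]:
    """Assign each point the index of its Pareto front (1 = best).

    Front 1 holds all non-dominated points; they are removed, and the
    non-dominated points among the rest form front 2, and so on.
    """
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    level = 1
    while remaining:
        # A point belongs to the current front if no remaining point dominates it.
        front = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = level
        remaining -= front
        level += 1
    return ranks


# Hypothetical genes with two hypothetical objective scores each
# (e.g. consistency of RNA-Seq change with two different histone marks).
genes = ["geneA", "geneB", "geneC", "geneD"]
scores = [(3.0, 2.5), (1.0, 3.0), (2.0, 1.0), (-1.0, -0.5)]
ranks = pareto_front_ranks(scores)

for gene, rank in sorted(zip(genes, ranks), key=lambda pair: pair[1]):
    print(f"{gene}: Pareto front {rank}")
```

In this toy example, geneA and geneB share the first front because neither is better than the other in both objectives, which is the reason for using Pareto ranking rather than collapsing several marks into a single weighted score. The quadratic-per-front loop is chosen for readability, not for speed on genome-scale gene lists.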
intePareto facilitates a comprehensive understanding of high-dimensional transcriptomic and epigenomic data. Its superiority to a naive differential gene expression analysis based on RNA-Seq and to an available integrative approach is demonstrated by analyzing a public dataset.