Ashoor Haitham, Louis-Brennetot Caroline, Janoueix-Lerosey Isabelle, Bajic Vladimir B, Boeva Valentina
King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Saudi Arabia.
Institut Curie, Inserm U830, PSL Research University, F-75005, Paris, France.
Nucleic Acids Res. 2017 May 5;45(8):e58. doi: 10.1093/nar/gkw1319.
Comparing histone modification profiles between cancer and normal states, or across different tumor samples, can provide insight into cancer initiation, progression and response to therapy. However, ChIP-seq histone modification data from cancer samples are distorted by the copy number variation inherent to cancer cells. We present HMCan-diff, the first method designed to analyze ChIP-seq data to detect changes in histone modifications between two cancer samples of different genetic backgrounds, or between a cancer sample and a normal control. HMCan-diff explicitly corrects for copy number bias and for other biases in ChIP-seq data, which significantly improves prediction accuracy compared to methods that do not apply such corrections. On in silico simulated ChIP-seq data generated from genomes with different copy number profiles, HMCan-diff performs much better than methods that do not correct for copy number bias. Additionally, we benchmarked HMCan-diff on four experimental datasets characterizing two histone marks in two different scenarios, and correlated changes in histone modifications between a cancer sample and a normal control with changes in gene expression. On all experimental datasets, HMCan-diff outperformed the other methods.
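To illustrate why copy number matters for differential ChIP-seq analysis, the following is a minimal sketch and not HMCan-diff's actual algorithm, whose details the abstract does not give. It assumes read counts are already binned along the genome and that a per-bin copy number estimate is available; the function names `correct_copy_number` and `log2_fold_change`, the baseline ploidy of 2, and the pseudocount are all hypothetical choices made for illustration.

```python
import math


def correct_copy_number(bin_counts, copy_numbers, ploidy=2.0):
    """Rescale per-bin read counts to the level expected at baseline ploidy.

    Assumption: ChIP-seq signal scales linearly with the number of DNA copies.
    """
    if len(bin_counts) != len(copy_numbers):
        raise ValueError("bin_counts and copy_numbers must have equal length")
    corrected = []
    for count, cn in zip(bin_counts, copy_numbers):
        if cn <= 0:
            # Homozygous deletion: no signal can be attributed to this bin.
            corrected.append(0.0)
        else:
            corrected.append(count * ploidy / cn)
    return corrected


def log2_fold_change(sample_a, sample_b, pseudocount=1.0):
    """Per-bin log2 ratio of two count vectors, stabilized by a pseudocount."""
    return [
        math.log2((a + pseudocount) / (b + pseudocount))
        for a, b in zip(sample_a, sample_b)
    ]


# Toy data: the tumor has an amplified bin (4 copies) and a deleted bin (1 copy),
# but the same histone mark density per DNA copy as the normal sample.
tumor_counts = [40, 80, 20]
tumor_cn = [2, 4, 1]
normal_counts = [40, 40, 40]
normal_cn = [2, 2, 2]

naive = log2_fold_change(tumor_counts, normal_counts)
corrected = log2_fold_change(
    correct_copy_number(tumor_counts, tumor_cn),
    correct_copy_number(normal_counts, normal_cn),
)
```

In this toy case the naive comparison reports a gain in the amplified bin and a loss in the deleted bin, whereas the corrected comparison reports no change in any bin, which is the kind of spurious differential call that a copy number correction is meant to avoid.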