Ji Hongkai, Li Xia, Wang Qian-fei, Ning Yang
Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA.
Proc Natl Acad Sci U S A. 2013 Apr 23;110(17):6789-94. doi: 10.1073/pnas.1204398110. Epub 2013 Apr 8.
We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.
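The workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dpca_sketch`, the input layout (loci × proteins × replicates), the use of a plain SVD on the matrix of between-condition mean differences, and the pooled-variance t-like statistic are all assumptions made for illustration. The abstract does not specify the normalization, the exact test statistic, or how the number of components is chosen.

```python
import numpy as np


def dpca_sketch(x1, x2, n_components=2):
    """Illustrative dPCA-style analysis (hypothetical helper, not the published method).

    x1, x2: arrays of shape (loci, proteins, replicates), one per condition,
    holding normalized ChIP-seq signal. Each condition needs replicates so
    that within-condition variation can be estimated (n1 + n2 > 2).
    """
    n1, n2 = x1.shape[2], x2.shape[2]

    # Between-condition difference of replicate means: one row per locus,
    # one column per protein (dataset).
    diff = x1.mean(axis=2) - x2.mean(axis=2)

    # Dimension reduction: the leading right singular vectors of the
    # difference matrix act as multiprotein differential patterns.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    loadings = vt[:n_components].T          # (proteins, components)

    # Score of every locus on every pattern.
    scores = diff @ loadings                # (loci, components)

    # Project each replicate onto the patterns to measure
    # within-condition variation along each pattern.
    p1 = np.einsum("gmr,mk->gkr", x1, loadings)
    p2 = np.einsum("gmr,mk->gkr", x2, loadings)
    ss1 = ((p1 - p1.mean(axis=2, keepdims=True)) ** 2).sum(axis=2)
    ss2 = ((p2 - p2.mean(axis=2, keepdims=True)) ** 2).sum(axis=2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)

    # Assumed statistic: between-condition difference relative to
    # within-condition variation (a two-sample pooled t-like ratio).
    t_stat = scores / np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return loadings, scores, t_stat
```

Under these assumptions, loci could be prioritized for a given pattern by sorting on the absolute value of the corresponding column of `t_stat`.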