Teng Mingxiang
Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, FL, USA.
Methods Mol Biol. 2023;2629:169-181. doi: 10.1007/978-1-0716-2986-4_9.
Chromatin immunoprecipitation sequencing (ChIP-seq) is widely used to identify where proteins bind along the genome. The protocol is mature and flexible enough to measure different types of protein binding, provided that sequencing parameters are tailored to the features of the protein of interest. Two distinct types of binding are point-source-like binding by transcription factors and diffuse, broadly distributed binding associated with histone modifications. Accordingly, statistical approaches have been proposed to address ChIP-seq-related questions according to these different protein features. In this chapter, we briefly summarize the statistical principles, approaches, and tools that are widely used in modeling ChIP-seq data, from raw data quality control to final result reporting. We discuss the key solutions to eight routine questions in ChIP-seq applications. We also discuss approaches suited to the unique data features of different ChIP-seq types. We hope this chapter will serve as a brief guide, especially for ChIP-seq beginners, giving them a high-level overview for understanding and designing processing plans for their ChIP-seq experiments.