Selecting ChIP-seq normalization methods from the perspective of their technical conditions.

Author information

Colando Sara, Schulz Danae, Hardin Johanna

Affiliations

Department of Statistics & Data Science, Carnegie Mellon University, 4909 Frew St., Pittsburgh, PA 15213, United States.

Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, CA 91711, United States.

Publication information

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf431.

Abstract

Chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) provides insights into both the genomic location occupied by the protein of interest and the difference in DNA occupancy between experimental states. Given that ChIP-seq data are collected experimentally, an important step for determining regions with differential DNA occupancy between states is between-sample normalization. While between-sample normalization is crucial for downstream differential binding analysis, the technical conditions underlying between-sample normalization methods have yet to be examined for ChIP-seq. We identify three important technical conditions underlying ChIP-seq between-sample normalization methods: balanced differential DNA occupancy, equal total DNA occupancy, and equal background binding across states. To illustrate the importance of satisfying the selected normalization method's technical conditions for downstream differential binding analysis, we simulate ChIP-seq read count data where different combinations of the technical conditions are violated. We then externally verify our simulation results using experimental data. Based on our findings, we suggest that researchers use their understanding of the ChIP-seq experiment at hand to guide their choice of between-sample normalization method. Alternatively, researchers can use a high-confidence peakset, which is the intersection of the differentially bound peaksets obtained from using different between-sample normalization methods. In our two experimental analyses, roughly half of the called peaks were called as differentially bound for every normalization method. High-confidence peaks are less sensitive to one's choice of between-sample normalization method, and thus could be a more robust basis for identifying genomic regions with differential DNA occupancy between experimental states when there is uncertainty about which technical conditions are satisfied.
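The high-confidence peakset described in the abstract is an intersection of the differentially bound peaksets produced under different between-sample normalization methods. A minimal sketch of that intersection follows. It assumes that peaks from every method are drawn from one shared consensus peakset, so that identical peak IDs refer to identical genomic regions. The function name, method labels, and peak IDs are hypothetical and are not taken from the paper.

```python
def high_confidence_peakset(db_peaks_by_method):
    """Return the peaks called differentially bound under every method.

    db_peaks_by_method maps a normalization-method label to the collection
    of peak IDs called differentially bound when that method was used.
    Assumption: all methods share one consensus peakset, so equal IDs
    denote the same genomic region.
    """
    peak_sets = [set(peaks) for peaks in db_peaks_by_method.values()]
    if not peak_sets:
        return set()
    return set.intersection(*peak_sets)


# Hypothetical method labels and peak IDs, for illustration only.
db_peaks_by_method = {
    "method_A": {"peak_1", "peak_2", "peak_3", "peak_5"},
    "method_B": {"peak_2", "peak_3", "peak_4", "peak_5"},
    "method_C": {"peak_2", "peak_3", "peak_5", "peak_6"},
}

high_conf = high_confidence_peakset(db_peaks_by_method)
print(sorted(high_conf))
```

Because the intersection keeps only peaks on which every method agrees, adding a further normalization method can only shrink the result or leave it unchanged. This trades sensitivity for robustness to the choice of method.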

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41d1/12368857/227dc2914cb2/bbaf431f1.jpg
