Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.

Affiliation

Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America.

Publication information

PLoS Comput Biol. 2011 Jul;7(7):e1002111. doi: 10.1371/journal.pcbi.1002111. Epub 2011 Jul 14.

Abstract

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
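The idea of allocating multi-reads as fractional counts can be illustrated with a small sketch. The abstract does not give the weighting formula, so the code below is an assumption made for illustration rather than the authors' method: each multi-read starts with equal weight at every candidate location, and the weights are then repeatedly rescaled in proportion to the total fractional coverage of the genomic bin around each candidate. The function name `allocate_multireads` and the parameters `window`, `n_iter`, and `pseudo` are hypothetical.

```python
from collections import defaultdict


def allocate_multireads(reads, window=100, n_iter=20, pseudo=1e-6):
    """Assign fractional weights to the candidate alignments of each read.

    reads: list of reads; each read is a list of (chrom, pos) candidates.
           A uni-read has one candidate, a multi-read has several.
    Returns one dict per read mapping each candidate to a weight;
    the weights of a read sum to 1.

    Illustrative heuristic only (an assumption, not the published scheme).
    """
    # Drop repeated candidates so that the uniform start sums to 1.
    reads = [list(dict.fromkeys(cands)) for cands in reads]

    # Start with equal weight on every candidate location.
    weights = [{loc: 1.0 / len(cands) for loc in cands} for cands in reads]

    for _ in range(n_iter):
        # Fractional coverage per fixed-width bin under the current weights.
        coverage = defaultdict(float)
        for w in weights:
            for (chrom, pos), frac in w.items():
                coverage[(chrom, pos // window)] += frac

        # Reweight each read in proportion to the coverage at its candidates.
        updated = []
        for w in weights:
            scores = {
                loc: coverage[(loc[0], loc[1] // window)] + pseudo for loc in w
            }
            total = sum(scores.values())
            updated.append({loc: s / total for loc, s in scores.items()})
        weights = updated

    return weights


if __name__ == "__main__":
    example = [
        [("chr1", 150)],                  # uni-read
        [("chr1", 160)],                  # uni-read
        [("chr1", 170)],                  # uni-read
        [("chr1", 155), ("chr2", 900)],   # multi-read with two candidates
    ]
    for read, w in zip(example, allocate_multireads(example)):
        print(read, w)
```

In this toy input the multi-read shares a bin with three uni-reads on chr1, so most of its weight moves to that location. Each read's own weight counts toward the coverage that rescales it, which makes the scheme self-reinforcing; a real analysis would also have to account for strand, fragment length, and alignment quality.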

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/107f/3136429/f0c288281f94/pcbi.1002111.g001.jpg
