

False positive peaks in ChIP-seq and other sequencing-based functional assays caused by unannotated high copy number regions.

Affiliation

Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.

Publication information

Bioinformatics. 2011 Aug 1;27(15):2144-6. doi: 10.1093/bioinformatics/btr354. Epub 2011 Jun 19.

Abstract

MOTIVATION

Sequencing-based assays such as ChIP-seq, DNase-seq and MNase-seq have become important tools for genome annotation. In these assays, short sequence reads enriched for loci of interest are mapped to a reference genome to determine their origin. Here, we consider whether false positive peak calls can be caused by a particular type of error in the reference genome: multicopy sequences that have been incorrectly assembled and collapsed into a single copy.

RESULTS

Using sequencing data from the 1000 Genomes Project, we systematically scanned the human genome for regions of high sequencing depth. These regions are highly enriched for erroneously inferred transcription factor binding sites, positions of nucleosomes and regions of open chromatin. We suggest a simple masking procedure to remove these regions and reduce false positive calls.
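The scan for regions of high sequencing depth can be illustrated with a minimal sketch. The abstract does not state the window size or the depth cutoff, so the fixed-width windows, the `fold` multiplier over the median depth, and the function name `high_depth_regions` are all assumptions made for illustration, not the authors' actual procedure.

```python
from statistics import median

def high_depth_regions(depths, window_size, fold=5.0):
    """Return merged half-open (start, end) intervals whose windowed depth
    exceeds fold * median depth.

    depths      -- read depth per consecutive fixed-width window (hypothetical input)
    window_size -- window width in base pairs (assumed, not from the paper)
    fold        -- cutoff multiplier over the median (assumed, not from the paper)
    """
    cutoff = fold * median(depths)
    regions = []
    for i, depth in enumerate(depths):
        if depth <= cutoff:
            continue
        start, end = i * window_size, (i + 1) * window_size
        if regions and regions[-1][1] == start:
            # Extend the previous region when flagged windows are adjacent.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions

# Toy depths: most windows sit near 30x, three windows pile up far above that.
depths = [30, 28, 31, 400, 380, 29, 30, 32, 500, 27]
print(high_depth_regions(depths, window_size=1000))
```

Using the median rather than the mean keeps the cutoff from being inflated by the very pileups the scan is meant to find.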

AVAILABILITY

Files for masking out these regions are available at eqtl.uchicago.edu.
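As a rough sketch of how such a mask might be applied, the snippet below drops every peak call that overlaps a masked interval. It assumes the mask and the peaks have already been loaded as `(chrom, start, end)` tuples with half-open coordinates; the format of the distributed files is not described here, and the names `mask_peaks` and `overlaps` and the example coordinates are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end

def mask_peaks(peaks, mask):
    """Return the peaks that do not overlap any masked region.

    peaks, mask -- iterables of (chrom, start, end) tuples (assumed layout)
    """
    # Group masked intervals by chromosome so each peak is only compared
    # against regions on its own chromosome.
    mask_by_chrom = {}
    for chrom, start, end in mask:
        mask_by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for chrom, start, end in peaks:
        masked = mask_by_chrom.get(chrom, [])
        if not any(overlaps(start, end, m_start, m_end) for m_start, m_end in masked):
            kept.append((chrom, start, end))
    return kept

# Hypothetical coordinates, for illustration only.
mask = [("chr1", 3000, 5000), ("chr2", 100, 200)]
peaks = [("chr1", 1000, 1200), ("chr1", 4900, 5100), ("chr2", 200, 300)]
print(mask_peaks(peaks, mask))
```

The linear scan over masked intervals is adequate for a short mask list; a sorted index or an interval tree would be the natural replacement for larger inputs.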


Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c89/3137225/bc15290efb94/btr354f1.jpg
