A workflow for simplified analysis of ATAC-cap-seq data in R.

Affiliation

Sainsbury Laboratory, Norwich Research Park, Norwich, UK, NR4 7UH.

Publication information

Gigascience. 2018 Jul 1;7(7). doi: 10.1093/gigascience/giy080.

Abstract

BACKGROUND

Assay for Transposase-Accessible Chromatin (ATAC)-cap-seq is a high-throughput sequencing method that combines ATAC-seq with targeted nucleic acid enrichment of precipitated DNA fragments. Working with a set of regions of interest that may be small in number and biologically dependent creates additional analytical difficulties. Common statistical pipelines for RNA sequencing might be assumed to apply, but they can give misleading results on ATAC-cap-seq data. A tool is needed that lets a nonspecialist user quickly and easily summarize data and apply sensible and effective normalization and analysis.

RESULTS

We developed atacR to allow a user to easily analyze their ATAC enrichment experiment. It provides comprehensive summary functions and diagnostic plots for studying enriched tag abundance. Between-sample normalization is made straightforward: functions are provided for normalizing by user-defined control regions, by whole library size, and by regions selected from the least variable regions in a dataset. Three methods are provided for detecting differential abundance of tags from enrichment methods: bootstrap t, Bayes factor, and a wrapped version of the standard exact test in the edgeR package. We compared the precision, recall, and F-score of each detection method on resampled datasets while varying the number of replicates, the significance threshold, and the number of genes changed. The Bayes factor method had the greatest overall detection power, though edgeR was slightly stronger in simulations with fewer genes changed.
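As a rough illustration of how precision, recall, and F-score can be computed when comparing detection methods on data where the truly changed regions are known, the following Python sketch may help. It is not taken from atacR: the function name `detection_scores`, the region labels, and the choice of the balanced F1 form of the F-score are all assumptions made for the example.

```python
def detection_scores(called, truly_changed):
    """Return (precision, recall, F1) for a set of called regions.

    `called` holds the regions a detection method flagged as changed;
    `truly_changed` holds the regions known to have changed (for example,
    because the change was introduced during resampling).
    Hypothetical helper, not part of atacR.
    """
    called = set(called)
    truly_changed = set(truly_changed)
    true_positives = len(called & truly_changed)

    # Precision: fraction of called regions that really changed.
    precision = true_positives / len(called) if called else 0.0
    # Recall: fraction of truly changed regions that were called.
    recall = true_positives / len(truly_changed) if truly_changed else 0.0

    # F1 (assumed here): harmonic mean of precision and recall.
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Made-up region labels, for illustration only.
truth = {"region_01", "region_02", "region_03", "region_04"}
calls = {"region_01", "region_02", "region_05"}
precision, recall, f_score = detection_scores(calls, truth)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

Repeating this calculation for each method across different replicate numbers and significance thresholds would give one way to tabulate the kind of comparison described above.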

CONCLUSIONS

Our package allows a nonspecialist user to easily and effectively apply methods appropriate to the analysis of ATAC-cap-seq in a reproducible manner. The package is implemented in pure R and is fully interoperable with common workflows in Bioconductor.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32f7/6047409/362f05874722/giy080fig1.jpg
