GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases.

Author information

Zhu Lihua Julie, Lawrence Michael, Gupta Ankit, Pagès Hervé, Kucukural Alper, Garber Manuel, Wolfe Scot A

Affiliations

Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, MA, USA.

Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA, USA.

Publication information

BMC Genomics. 2017 May 15;18(1):379. doi: 10.1186/s12864-017-3746-y.

Abstract

BACKGROUND

Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency at these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased, genome-wide detection of nuclease cleavage sites. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed.

RESULTS

Here, we describe an open-source, open-development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to nuclease platforms that differ in the length and complexity of their guide and PAM recognition sequences or in their DNA cleavage position. They also enable users to customize sequence aggregation criteria and vary peak calling thresholds, both of which can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target site, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as the cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity.
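
To make the effect of aggregation criteria and peak calling thresholds concrete, the following is a minimal conceptual sketch in Python. It is not the GUIDEseq package's API or algorithm: the function names `call_peaks` and `shared_sites`, the parameters `window_size` and `min_reads`, the fixed-bin aggregation, and the toy read positions are all assumptions made for illustration only.

```python
from collections import Counter


def call_peaks(read_positions, window_size=20, min_reads=3):
    """Aggregate (chromosome, position) read starts into fixed windows and
    keep windows supported by at least `min_reads` reads.

    Hypothetical simplification: fixed, non-overlapping bins stand in for
    whatever aggregation strategy a real pipeline uses.
    """
    counts = Counter(
        (chrom, pos // window_size) for chrom, pos in read_positions
    )
    return sorted(
        (chrom, bin_index * window_size, n)
        for (chrom, bin_index), n in counts.items()
        if n >= min_reads
    )


def shared_sites(peaks_a, peaks_b):
    """Return the (chromosome, window start) sites present in both peak lists."""
    sites_a = {(chrom, start) for chrom, start, _ in peaks_a}
    sites_b = {(chrom, start) for chrom, start, _ in peaks_b}
    return sites_a & sites_b


# Toy read start positions (invented for illustration, not real data).
reads = [
    ("chr1", 1005), ("chr1", 1010), ("chr1", 1012), ("chr1", 1018),
    ("chr2", 500), ("chr2", 503),
    ("chr3", 42),
]

strict = call_peaks(reads, min_reads=3)   # only the well-supported window
lenient = call_peaks(reads, min_reads=2)  # also keeps the chr2 window
overlap = shared_sites(strict, lenient)
```

With these toy reads, lowering the threshold from 3 to 2 reads recovers one additional candidate site, which is the kind of sensitivity trade-off the adjustable parameters are described as controlling.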

CONCLUSION

The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html .
