Crawley Alexandra B, Henriksen James R, Barrangou Rodolphe
1 Department of Food, Bioprocessing, and Nutrition Sciences, North Carolina State University, Raleigh, North Carolina.
2 AgBiome, Durham, North Carolina.
CRISPR J. 2018 Apr;1(2):171-181. doi: 10.1089/crispr.2017.0022. Epub 2018 Apr 9.
CRISPR-Cas adaptive immune systems of bacteria and archaea have catapulted into the scientific spotlight as genome editing tools. To aid researchers in the field, we have developed an automated pipeline, named CRISPRdisco (CRISPR discovery), to identify CRISPR repeats and cas genes in genome assemblies, determine type and subtype, and describe system completeness. All six major types, the 23 currently recognized subtypes, and novel putative V-U types are detected. Here, we use the pipeline to identify and classify putative CRISPR-Cas systems in 2,777 complete genomes from the NCBI RefSeq database. This allows comparison to previous publications and investigation of the occurrence and size of CRISPR-Cas systems. The software, available at http://github.com/crisprlab/CRISPRdisco, provides reproducible, standardized, accessible, transparent, and high-throughput analysis methods to all researchers in and beyond the CRISPR-Cas research community. This tool opens new avenues to enable classification within a complex nomenclature and provides analytical methods in a field that has evolved rapidly.
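The type-assignment and completeness steps named in the abstract can be illustrated with a minimal sketch. This is not CRISPRdisco's code or its actual rule set: the function `classify_locus`, the gene tables, and the completeness rule (signature gene plus cas1, cas2, and a repeat array) are hypothetical simplifications, based only on the widely used convention of one signature gene per CRISPR-Cas type.

```python
# Illustrative only: NOT the CRISPRdisco implementation.
# Signature genes follow the general CRISPR-Cas classification convention;
# the table and the completeness rule below are simplifying assumptions.
SIGNATURE_GENES = {
    "I": {"cas3"},
    "II": {"cas9"},
    "III": {"cas10"},
    "IV": {"csf1"},
    "V": {"cas12"},
    "VI": {"cas13"},
}
ADAPTATION_GENES = {"cas1", "cas2"}


def classify_locus(genes, has_repeat_array):
    """Assign a type to a locus from its detected cas genes (hypothetical helper).

    genes: iterable of gene names found near a candidate locus.
    has_repeat_array: True if a CRISPR repeat array was detected.
    """
    found = {g.lower() for g in genes}
    # Collect every type whose signature gene is present at the locus.
    matches = [t for t, sig in SIGNATURE_GENES.items() if sig & found]
    if len(matches) == 1:
        system_type = matches[0]
    elif not matches:
        system_type = "unclassified"
    else:
        system_type = "ambiguous"
    # Assumed rule: complete = one signature gene + cas1 + cas2 + repeat array.
    complete = (
        system_type in SIGNATURE_GENES
        and ADAPTATION_GENES <= found
        and has_repeat_array
    )
    return {"type": system_type, "complete": complete}


if __name__ == "__main__":
    print(classify_locus(["cas9", "cas1", "cas2"], has_repeat_array=True))
    print(classify_locus(["cas3", "cas1"], has_repeat_array=True))
```

A real classifier would also need subtype-level rules and would have to account for systems that naturally lack adaptation genes, so the completeness flag here should be read as a toy criterion only.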