MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.

Authors

Abby Sophie S, Néron Bertrand, Ménager Hervé, Touchon Marie, Rocha Eduardo P C

Affiliations

Microbial Evolutionary Genomics, Institut Pasteur, Paris, France; UMR3525, CNRS, Paris, France.

Centre d'Informatique pour la Biologie, Institut Pasteur, Paris, France.

Publication information

PLoS One. 2014 Oct 17;9(10):e110726. doi: 10.1371/journal.pone.0110726. eCollection 2014.

Abstract

MOTIVATION

Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose.

RESULTS

Macromolecular System Finder (MacSyFinder) provides a flexible framework for modelling the properties of molecular systems (cellular machinery or pathways), including their components, their evolutionary associations with other systems, and their genetic architecture. Modelled features also include functional analogs and the use of the same component by several systems. Models are used to search for molecular systems in complete genomes or in unstructured data such as metagenomes. The components of the systems are searched for by sequence similarity using hidden Markov model (HMM) protein profiles. Hits are assigned to a given system according to their compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder, we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles.
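The assignment step described above (hits are kept only if they comply with the content and organization of a model) can be illustrated with a toy sketch. This is not MacSyFinder's code or API: the `Hit` and `SystemModel` classes, the field names (`max_gap`, `min_mandatory`, `min_total`) and the example gene sets are all hypothetical, and the clustering rule is a simplified assumption about what "genetic architecture" could mean for a co-localized system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str      # name of the protein profile that matched (hypothetical)
    position: int  # rank of the protein along the replicon


@dataclass(frozen=True)
class SystemModel:
    mandatory: frozenset  # components that must be present
    accessory: frozenset  # components that may be present
    max_gap: int          # max intervening genes between consecutive hits
    min_mandatory: int    # minimum number of distinct mandatory components
    min_total: int        # minimum number of distinct components overall


def cluster_hits(hits, max_gap):
    """Group hits separated by at most `max_gap` intervening genes."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h.position):
        if clusters and hit.position - clusters[-1][-1].position <= max_gap + 1:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


def complies(cluster, model):
    """Check the content rule: enough distinct mandatory and total components."""
    genes = {h.gene for h in cluster}
    n_mandatory = len(genes & model.mandatory)
    n_total = len(genes & (model.mandatory | model.accessory))
    return n_mandatory >= model.min_mandatory and n_total >= model.min_total


def find_systems(hits, model):
    """Return the clusters of hits that satisfy both organization and content."""
    known = model.mandatory | model.accessory
    relevant = [h for h in hits if h.gene in known]
    return [c for c in cluster_hits(relevant, model.max_gap) if complies(c, model)]


# Toy model and hits (illustrative only, not an actual Cas-finder model).
toy_model = SystemModel(
    mandatory=frozenset({"cas1", "cas2"}),
    accessory=frozenset({"cas4"}),
    max_gap=2,
    min_mandatory=2,
    min_total=2,
)
toy_hits = [Hit("cas1", 10), Hit("cas2", 11), Hit("cas4", 13), Hit("cas1", 50)]
toy_systems = find_systems(toy_hits, toy_model)
```

In this toy run, the three neighbouring hits form one compliant cluster, while the isolated `cas1` hit at position 50 is discarded because it lacks the second mandatory component. Features mentioned in the abstract such as functional analogs or components shared between systems are not modelled here.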

AVAILABILITY AND IMPLEMENTATION

MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, HMMER and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and HMMER/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.
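Before installing, it can help to confirm that the external programs listed above are reachable. The following is a minimal sketch, assuming that the HMMER executable of interest is `hmmsearch` (an assumption, since the abstract only names the HMMER package) and that both tools are expected on the `PATH`. The helper names are hypothetical, and the sketch does not check tool versions.

```python
import shutil

# Executable names assumed for this check; `hmmsearch` is an assumption,
# `makeblastdb` is named in the abstract.
REQUIRED_TOOLS = ("hmmsearch", "makeblastdb")


def locate_tools(names=REQUIRED_TOOLS):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(located):
    """Return the sorted names of executables that were not found."""
    return sorted(name for name, path in located.items() if path is None)


if __name__ == "__main__":
    located = locate_tools()
    for name, path in located.items():
        print(f"{name}: {path or 'not found'}")
    absent = missing_tools(located)
    if absent:
        print("Missing: " + ", ".join(absent))
```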

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a75/4201578/d297f1bc1af4/pone.0110726.g001.jpg
