CLOCI: unveiling cryptic fungal gene clusters with generalized detection.

Affiliations

Department of Plant Pathology, The Ohio State University, Columbus, OH 43210, USA.

Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210, USA.

Publication information

Nucleic Acids Res. 2024 Sep 9;52(16):e75. doi: 10.1093/nar/gkae625.

Abstract

Gene clusters are genomic loci that contain multiple genes that are functionally and genetically linked. Gene clusters collectively encode diverse functions, including small molecule biosynthesis, nutrient assimilation, metabolite degradation, and production of proteins essential for growth and development. Identifying gene clusters is a powerful tool for small molecule discovery and provides insight into the ecology and evolution of organisms. Current detection algorithms focus on canonical 'core' biosynthetic functions many gene clusters encode, while overlooking uncommon or unknown cluster classes. These overlooked clusters are a potential source of novel natural products and comprise an untold portion of overall gene cluster repertoires. Unbiased, function-agnostic detection algorithms therefore provide an opportunity to reveal novel classes of gene clusters and more precisely define genome organization. We present CLOCI (Co-occurrence Locus and Orthologous Cluster Identifier), an algorithm that identifies gene clusters using multiple proxies of selection for coordinated gene evolution. Our approach generalizes gene cluster detection and gene cluster family circumscription, improves detection of multiple known functional classes, and unveils non-canonical gene clusters. CLOCI is suitable for genome-enabled small molecule mining, and presents an easily tunable approach for delineating gene cluster families and homologous loci.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07ba/11381361/cc6b81ac57c2/gkae625figgra1.jpg
