Domainator, a flexible software suite for domain-based annotation and neighborhood analysis, identifies proteins involved in antiviral systems.

Authors

Johnson Sean R, Weigele Peter R, Fomenkov Alexey, Ge Andrew, Vincze Anna, Eaglesham James B, Roberts Richard J, Sun Zhiyi

Affiliation

New England Biolabs Inc., Ipswich, MA 01938, USA.

Publication

Nucleic Acids Res. 2025 Jan 11;53(2). doi: 10.1093/nar/gkae1175.

DOI:10.1093/nar/gkae1175
PMID:39657740
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11754643/
Abstract

The availability of large databases of biological sequences presents an opportunity for in-depth exploration of gene diversity and function. Bacterial defense systems are a rich source of diverse but difficult-to-annotate genes with biotechnological applications. In this work, we present Domainator, a flexible and modular software suite for domain-based gene neighborhood and protein search, extraction and clustering. We demonstrate the utility of Domainator through three examples related to bacterial defense systems. First, we cluster CRISPR-associated Rossmann fold (CARF)-containing proteins with difficult-to-annotate effector domains, classifying most of them as likely transcriptional regulators and a subset as likely RNases. Second, we extract and cluster P4-like phage satellite defense hotspots, identify an abundant variant of Lamassu defense systems and demonstrate its in vivo activity against several T-even phages. Third, we integrate a protein language model into Domainator and use it to identify restriction endonucleases with low similarity to known reference sequences, validating the activity of one example in vitro. Domainator is made available as an open-source package with detailed documentation and usage examples.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/a2a766a24122/gkae1175figgra1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/a68c8abc4e06/gkae1175fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/58795be5274d/gkae1175fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/1ae8948231c3/gkae1175fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/ece27d2d8bff/gkae1175fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/3e73dd0c09fb/gkae1175fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/347d/11754643/0a8a6499586f/gkae1175fig6.jpg


