Novel selenoprotein neighborhoods suggest specialized biochemical processes.

Authors

Haft Daniel H, Tolstoy Igor

Affiliations

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.

Independent Researcher, Bethesda, Maryland, USA.

Publication information

mSystems. 2025 Apr 22;10(4):e0141724. doi: 10.1128/msystems.01417-24. Epub 2025 Mar 31.

Abstract

Prokaryotic genomes encode selenoproteins sparsely, roughly one protein per 5,000. Finding novel selenoprotein families can expose unknown biological processes that are enabled, or at least enhanced, by having a selenium atom replace a sulfur atom in some cysteine residues. Here, we report the discovery of 18 novel selenoprotein families or second selenocysteine sites in previously unrecognized extensions of protein translations. Most of these families were missed previously because of some confounding factor: too small a family, too few selenoproteins in the family, selenocysteine (U) too close to one end, or a skew toward understudied or uncultured lineages. Discoveries were triggered by observations during the ongoing construction of protein family models for the National Center for Biotechnology Information's RefSeq and Prokaryotic Genome Annotation Pipeline (PGAP), or were made by targeted searches for novel selenoproteins in the vicinity of known ones, rather than by any broadly applied genome mining method. Unrelated families TsoA, TsoB, TsoC, and TsoX are adjacent in (three selenoprotein operon) loci in the bacterial phylum . TrsS (third radical SAM selenoprotein) occurs strictly in the context of a molybdopterin-dependent aldehyde oxidoreductase. A short carboxy-terminal motif, U-X-X-stop (UXX-star), occurs in selenoproteins with various architectures, usually providing the second U in the protein. The multiple new selenocysteine insertion sites, selenoprotein families, and selenium-dependent operons we curated manually suggest that many more proteins and pathways remain to be discovered; once improved computational methods are applied comprehensively to the latest collections of microbial genomes and metagenomes, they may reveal surprising new biochemical processes.
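The U-X-X-stop (UXX-star) motif described above can be illustrated with a minimal Python sketch. This is not the authors' method, which relied on manual curation; it only shows what the pattern means at the sequence level. It assumes one-letter protein sequences in which selenocysteine is written as "U" and a trailing "*" may mark the stop. The function names and the toy sequences are hypothetical.

```python
import re

# Assumption: selenocysteine is encoded as "U" in a one-letter protein string,
# and an optional trailing "*" marks the stop codon.
# Pattern: U, then exactly two residues of any kind, then the end of the protein.
UXX_STAR = re.compile(r"U[A-Z]{2}\*?$")


def has_uxx_star(seq: str) -> bool:
    """Return True if U is the third residue from the carboxy terminus."""
    return UXX_STAR.search(seq.strip().upper()) is not None


def count_selenocysteines(seq: str) -> int:
    """Count U residues, e.g. to see whether the motif supplies a second U."""
    return seq.upper().count("U")


if __name__ == "__main__":
    # Toy sequences for illustration only, not real proteins.
    toy = {
        "ends_with_uxx": "MKTAYCGGUGA*",
        "internal_u_only": "MKUAAAAGK",
        "two_u_sites": "MSUKLPQRTUHK",
    }
    for name, seq in toy.items():
        print(name, has_uxx_star(seq), count_selenocysteines(seq))
```

A match on this pattern is only a hint. Whether a terminal U is genuine depends on evidence the sketch does not examine, such as recoding signals and conservation across the family.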

IMPORTANCE

Next-generation DNA sequencing and the assembly of metagenome-assembled genomes (MAGs) for uncultured species of various microbiomes add a vast "dark matter" of hard-to-decipher protein sequences. Selenoproteins, optimized by natural selection to encode selenocysteine where cysteine might have been encoded much more easily, carry a strong clue to their function: some specialized aspect of binding or catalysis. Operons with multiple adjacent, but otherwise unrelated, selenoproteins should provide even more vivid information. In this study, efforts in protein family construction and curation, aimed at improving the PGAP genome annotation pipeline, generated multiple novel selenoprotein-containing genomic contexts that may lead to the future characterization of several systems of proteins. Past observations suggest roles in the metabolic handling of trace elements (mercury, tungsten, arsenic, etc.) or of organic compounds refractory to simpler enzymatic pathways. In addition, the work significantly expands the truth set of validated selenoproteins, which should aid future, more automated genome mining efforts.
