Department of Microbiology, University of La Laguna, La Laguna, Spain.
Systems Biology Program, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Madrid, Spain.
ISME J. 2019 May;13(5):1183-1197. doi: 10.1038/s41396-019-0347-6. Epub 2019 Jan 14.
Dimethylsulfoniopropionate (DMSP) is produced mainly by phytoplankton and bacteria. It is relatively abundant and ubiquitous in the marine environment, where bacterioplankton readily use it as both a carbon and a sulfur source. In one transformation pathway, part of the molecule becomes dimethylsulfide (DMS), which escapes into the atmosphere and plays an important role in sulfur exchange between the oceans and the atmosphere. Through the other dominant catabolic pathway, bacteria use DMSP as a sulfur source. During the past few years, a number of genes involved in its transformation have been characterized. Identifying genes in taxonomic groups that are not amenable to conventional cultivation methods is challenging. Indeed, functional annotation of genes in environmental studies is not straightforward, because particular taxa are not well represented in the available sequence databases. Furthermore, many genes belong to families of paralogs with similar sequences but possibly different functions. In this study, we develop in silico approaches to infer the protein function of an environmentally important gene (dmdA), which carries out the first step in sulfur assimilation from DMSP. The method combines a set of tools to annotate a targeted gene in genome databases and metagenome assemblies. The method will be useful for identifying genes that carry out key biochemical processes in the environment.