CNRS, Centrale Marseille, iSm2, Aix Marseille Univ, Marseille, France.
BMC Bioinformatics. 2022 Aug 2;23(1):313. doi: 10.1186/s12859-022-04832-6.
DIRs are poorly understood proteins that can scavenge free radicals, which are highly reactive with molecules in their vicinity. More remarkably, starting from these highly unstable species, they carry out a selective reaction (i.e., stereo- and enantioselective) on a well-defined substrate to give a very precise product. To date, only three products have been demonstrated, all from studies of plant DIRs; until now, plants were the only kingdom in which these proteins had been demonstrated. Within this kingdom, each DIR protein has its own type of substrate. The products identified so far nevertheless have a strong economic impact. In agriculture, for example, the DIRs of cotton are involved in the biosynthesis of (+)-gossypol, an antifeedant repellent produced by the cotton plant. In forsythia species, DIRs are involved in the biosynthesis of (-)-pinoresinol, an intermediate leading to the synthesis of podophyllotoxin, a powerful anticancer agent. Recently, a clear and potentially high-impact line of study has emerged from the hypothesis that DIR proteins may be encoded in the genomes of prokaryotes. Working with this type of organism is an undeniable advantage: many sequenced genomes are available, and the molecular tools are already developed. Microbes are also less complex and easier to work with, which offers many opportunities for laboratory studies. In addition, the diversity of their environments (e.g., soil, aquatic environments, and extreme conditions of pH, temperature, or pressure) makes them highly varied subjects of study. Identifying new DIR proteins from bacteria means identifying new substrate or product molecules from these organisms.
Such identification promises a deeper understanding of the mechanism of action of these proteins, and it will most likely have a strong impact in the fields of agricultural, pharmaceutical, and/or food chemistry.
Our goal is to obtain as much information as possible about these proteins in order to understand how they function. Analyses of structural and functional genomic data led to the identification of the Pfam PF03018 domain as characteristic of DIR proteins. This domain was also identified in the sequences of bacterial proteins, which were therefore named DIR-like (DIRL). We chose a multidisciplinary bioinformatic approach centered on bacterial genome identification, gene expression and regulation signals, protein structures, and their molecular information content. The objective of this study was to perform a thorough bioinformatic analysis of these DIRLs and to highlight any information that could guide the selection of candidate bacteria for further cloning, purification, and characterization of bacterial DIRs.
Based on the identification of DIRL genes, their primary structures, predictions of their secondary and tertiary structures, prediction of DIRL signal sequences, and analysis of their gene organization and potential regulation, a list of primary bacterial candidates is proposed.