Universidade Estadual de Santa Cruz (UESC), Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Rodovia Ilhéus-Itabuna km 16, 45662-900 Ilhéus, BA, Brazil.
Universidade Estadual de Santa Cruz (UESC), Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Rodovia Ilhéus-Itabuna km 16, 45662-900 Ilhéus, BA, Brazil; CIRAD, UMR AGAP, F-34398 Montpellier, France.
Genomics. 2020 May;112(3):2666-2676. doi: 10.1016/j.ygeno.2020.03.001. Epub 2020 Mar 3.
In plant-pathogen interactions, plant immunity mediated by receptors of pathogen-associated molecular patterns (PAMPs), also called pattern recognition receptors (PRRs), and by resistance (R) proteins operates in different ways depending on both the plant and the pathogen species. Searching for a structural pattern defined by the presence and absence of characteristic domains, regardless of their position within the sequence, could be an efficient way to identify PRR proteins. Here, we develop a method, based mainly on text mining and set theory, that identifies PRR and R genes and classifies them into 13 categories according to the presence and absence of the main domains. Analyzing 24 plant and algal genomes, we showed that RRGPredictor was more efficient, specific and sensitive than other available tools, and that it identified PRR proteins varying in size and in domain distribution along the sequence. Besides making new plant PRR proteins easy to identify, RRGPredictor has a low computational cost.
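The idea of classifying proteins by which domains are present or absent, regardless of their order, can be illustrated with plain set operations. The following is a minimal sketch and not the RRGPredictor implementation: the rule table, the domain labels and the function name `classify` are assumptions made for illustration, and the four example categories do not reproduce the 13 categories used by the tool.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical rules: (category, domains that must be present, domains that must be absent).
# Illustrative only; these are NOT the 13 categories defined by RRGPredictor.
Rule = Tuple[str, FrozenSet[str], FrozenSet[str]]

RULES: List[Rule] = [
    ("CNL", frozenset({"CC", "NBS", "LRR"}), frozenset({"TIR"})),
    ("TNL", frozenset({"TIR", "NBS", "LRR"}), frozenset()),
    ("RLK", frozenset({"LRR", "TM", "KINASE"}), frozenset({"NBS"})),
    ("RLP", frozenset({"LRR", "TM"}), frozenset({"NBS", "KINASE"})),
]


def classify(domains: Iterable[str]) -> str:
    """Return the first category whose presence/absence rule matches.

    `domains` is the collection of domain labels annotated on one protein.
    Converting it to a set discards order and repetition, so the position
    of each domain within the sequence plays no role.
    """
    found = frozenset(d.upper() for d in domains)
    for name, required, forbidden in RULES:
        # Presence test: every required domain is in the protein's domain set.
        # Absence test: the protein's domain set shares nothing with the forbidden set.
        if required <= found and found.isdisjoint(forbidden):
            return name
    return "unclassified"


if __name__ == "__main__":
    # The same domains in a different order give the same category.
    print(classify(["LRR", "NBS", "CC"]))
    print(classify(["CC", "NBS", "LRR"]))
```

Because each rule is a subset test plus a disjointness test, the cost per protein grows with the number of rules rather than with sequence length, which is consistent with the low computational cost reported in the abstract.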