Univ Rennes, Inria, CNRS, IRISA, UMR 6074, Rennes, France.
Univ Rennes, Inserm, EHESP, Irset, UMR S1085, Rennes, France.
PLoS Comput Biol. 2023 Aug 31;19(8):e1011404. doi: 10.1371/journal.pcbi.1011404. eCollection 2023 Aug.
Numerous sequence- and structure-based computational methods have been developed to characterize protein function, but they remain unsatisfactory for handling the multiple functions of multi-domain protein families. Here we propose an original approach based on 1) the detection of conserved sequence modules using partial local multiple alignment, 2) the phylogenetic inference of the evolutionary histories of species, genes, modules, and functions, and 3) the identification of co-appearances of modules and functions. We apply our framework to the multi-domain ADAMTS-TSL family, which comprises ADAMTS (A Disintegrin-like and Metalloproteinase with ThromboSpondin motif) and ADAMTS-like proteins, across nine species including human, and identify 45 sequence module signatures associated with the occurrence of 278 protein-protein interactions in ancestral genes. Some of these signatures are supported by published experimental data, and the others provide new insights (e.g. ADAMTS-5). The module signatures of ADAMTS ancestors notably highlight the dual variability of the propeptide and ancillary regions, suggesting that these two regions were important in the specialization of ADAMTS during evolution. Our analyses further indicate convergent interactions of ADAMTS with the COMP and CCN2 proteins. Overall, our study provides 186 sequence module signatures that discriminate distinct subgroups of ADAMTS and ADAMTSL and that may result from selective pressures on novel functions and phenotypes.
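To make step 3 concrete, the following is a minimal sketch of how co-appearances of modules and functions could be detected on a gene tree. It is not the authors' implementation. It assumes that the module content and the interaction partners of every node, ancestors included, have already been inferred, and it treats an item as "appearing" at a node when it is present there but absent from the parent. The tree, the node names, the module labels (`m1`, `m2`, `m3`) and the partner labels (`P1`, `P2`) are all hypothetical toy data.

```python
def gains(tree, annotations, root):
    """For each node, return the items present at the node but absent from its parent.

    The root has no parent, so everything annotated on it counts as gained.
    """
    gained = {root: set(annotations[root])}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in tree.get(parent, []):
            gained[child] = set(annotations[child]) - set(annotations[parent])
            stack.append(child)
    return gained


def co_appearances(tree, modules, functions, root):
    """Return the nodes where at least one module and one function appear together.

    Each such node maps to (module signature, functions gained at that node).
    """
    gained_modules = gains(tree, modules, root)
    gained_functions = gains(tree, functions, root)
    result = {}
    for node, mods in gained_modules.items():
        funcs = gained_functions[node]
        if mods and funcs:
            result[node] = (frozenset(mods), frozenset(funcs))
    return result


# Hypothetical toy gene tree: parent -> children.
tree = {"anc0": ["anc1", "geneC"], "anc1": ["geneA", "geneB"]}

# Hypothetical module content of each node (ancestral states assumed inferred).
modules = {
    "anc0": {"m1"},
    "anc1": {"m1", "m2"},
    "geneA": {"m1", "m2", "m3"},
    "geneB": {"m1", "m2"},
    "geneC": {"m1"},
}

# Hypothetical interaction partners of each node (assumed inferred as well).
functions = {
    "anc0": set(),
    "anc1": {"P1"},
    "geneA": {"P1"},
    "geneB": {"P1", "P2"},
    "geneC": set(),
}

signatures = co_appearances(tree, modules, functions, "anc0")
```

In this toy example only `anc1` gains both a module (`m2`) and an interaction partner (`P1`), so it is the only node reported. `geneA` gains a module without a new partner and `geneB` gains a partner without a new module, so neither counts as a co-appearance.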