1 Organismal and Evolutionary Biology Research Programme, Viikki Plant Science Centre, VIPS, Faculty of Biological and Environmental Sciences, University of Helsinki, Viikinkaari 1 (POB65), FI-00014 Helsinki, Finland.
2 Structural Plant Biology Laboratory, Department of Botany and Plant Biology, University of Geneva, Geneva, Switzerland.
Commun Biol. 2019 Feb 8;2:56. doi: 10.1038/s42003-019-0306-9. eCollection 2019.
Large protein families are a prominent feature of plant genomes, and their size variation is a key element for adaptation. However, gene and genome duplications pose difficulties for functional characterization and translational research. Here we infer the evolutionary history of the DOMAIN OF UNKNOWN FUNCTION (DUF) 26-containing proteins. The DUF26 emerged in secreted proteins. Domain duplications and rearrangements led to the appearance of CYSTEINE-RICH RECEPTOR-LIKE PROTEIN KINASES (CRKs) and PLASMODESMATA-LOCALIZED PROTEINS (PDLPs). The DUF26 is land plant-specific, but structural analyses of PDLP ectodomains revealed strong similarity to fungal lectins; DUF26-containing proteins may thus constitute a group of plant carbohydrate-binding proteins. CRKs expanded through tandem duplications and preferential retention of duplicates following whole genome duplications, whereas PDLPs evolved according to the dosage balance hypothesis. We propose that new gene families mainly expand through small-scale duplications, while fractionation and genetic drift after whole genome multiplications drive families towards dosage balance.