Department of Biology, University of Konstanz, Konstanz, Germany.
BMC Evol Biol. 2010 Jul 30;10:233. doi: 10.1186/1471-2148-10-233.
The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, such as the LHC-like and photosystem II subunit S (PSBS) families. The evolution of this complex superfamily has long remained elusive, partly because some of its families had not been identified previously.
In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both hidden Markov model analyses and structural predictions, we identified numerous new LHC-like sequences and described several new families, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. Tests of alternative tree topologies for sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly support independent origins of the LHC and PSBS families via two unrelated internal gene duplication events. Cluster likelihood mapping confirmed this result.
The independent evolution of the LHC and PSBS families is supported by strong phylogenetic evidence. In addition, it seems likely that the LHC and PSBS families originated from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here agrees with a character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages. Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily.