Zhou Jin, Zhang Yao-Yang, Li Qing-Yun, Cai Zhong-Hua
1. The Division of Ocean Science & Technology, Graduate School at Shenzhen, Tsinghua University, Shenzhen, 518055, P. R. China.
2. Shenzhen Public Platform of Screening & Application of Marine Microbial Resources, Graduate School at Shenzhen, Tsinghua University, Shenzhen, 518055, P. R. China.
3. Shenzhen Key Laboratory for Coastal Ocean Dynamic and Environment, Graduate School at Shenzhen, Tsinghua University, Shenzhen, 518055, P. R. China.
4. School of Life Science, Tsinghua University, Beijing, 100084, P. R. China.
Int J Biol Sci. 2015 Jul 14;11(9):1016-25. doi: 10.7150/ijbs.11751. eCollection 2015.
The cathepsin L family, an important group of cysteine proteases found in lysosomes, is categorized into cathepsins B, F, H, K, L, S, and W in vertebrates. This categorization is based on sequence alignment and traditional functional classification, but the evolutionary relationships among family members remain unclear. This study determined the evolutionary relationships of cathepsin L family genes in vertebrates through phylogenetic reconstruction. The results showed that cathepsins F, H, S and K, and L and V diverged in chronological order. Tandem-repeat duplication occurred in the evolutionary history of the cathepsin L family: cathepsin L in zebrafish, cathepsins S and K in Xenopus, and cathepsin L in mice and rats underwent evident tandem-repeat events. Positive selection was detected in cathepsin L-like members in mice and rats, and the amino acid sites under positive selection pressure were calculated. Most of these sites lie at the junctions between secondary structure elements, suggesting that they may slightly alter the spatial structure. Strong positive selection was also observed in cathepsin V (L2) of primates, indicating that this enzyme has some specialized functions. Our work provides a brief evolutionary history of the cathepsin L family and distinguishes cathepsins S and K from cathepsin L based on when they appeared in vertebrates. Positive selection was the specific cause of the differentiation of cathepsin L family genes, confirming that functional variation after gene expansion events is related to interactions with the environment and adaptability.