Zhou Jingqi, Liu Dangyun, Sa Zhining, Huang Wei, Zou Yangyun, Gu Xun
State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433, PR China.
Mol Phylogenet Evol. 2017 Aug;113:126-138. doi: 10.1016/j.ympev.2017.05.010. Epub 2017 May 12.
Predicting the amino acid residues that underlie functional divergence after gene duplication has long been a major research focus, because the predicted sites can serve as candidates for further functional experiments. It is both important and interesting to know how many sites, on average, are responsible for the functional divergence between duplicate genes. In this article, we studied the two basic types of functional divergence (type-I and type-II) in depth in order to estimate accurately the number of sites related to functional divergence. Type-I divergence results from altered functional constraints (i.e., different evolutionary rates) between duplicate genes, whereas type-II divergence refers to residues that are conserved by functional constraints but exhibit different physicochemical properties (e.g., charge or hydrophobicity) between duplicates. We applied an effective site number (N) strategy, which uses a stepwise regression model to calculate the minimum number of residues responsible for functional divergence without requiring a preset threshold. We found that the N-determined cut-off value varies among duplicate pairs, suggesting that a single empirical cut-off value is not suitable for every case. Under our standard N calculation method, we estimated that fewer than 15% of residues are required for functional divergence between paralogous genes. Finally, we established a database, DIVERGE-D, as a public resource for the N sites predicted between the paralog pairs in this study; these sites can be used as candidates for further biological engineering and experimentation.