Cui Jike, Smith Temple, Robbins Phillips W, Samuelson John
Department of Molecular and Cell Biology, Boston University Goldman School of Dental Medicine, Boston, MA 02118, USA.
Proc Natl Acad Sci U S A. 2009 Aug 11;106(32):13421-6. doi: 10.1073/pnas.0905818106. Epub 2009 Jul 28.
Numerous protists and rare fungi have truncated Asn-linked glycan precursors and lack N-glycan-dependent quality control (QC) systems for glycoprotein folding in the endoplasmic reticulum. Here, we show that the abundance of sequons (NXT or NXS), which are sites for N-glycosylation of secreted and membrane proteins, varies by more than a factor of 4 among phylogenetically diverse eukaryotes, and that this variation depends on a few variables. There is a positive correlation between sequon density and the AT content of coding regions, although no causality can be inferred. In contrast, there appears to be Darwinian selection for sequons containing Thr, but not Ser, in eukaryotes that have N-glycan-dependent QC systems. Selection for sequons with Thr, which nearly doubles the sequon density in human secreted and membrane proteins, occurs by an increased conditional probability that Asn and Thr are present in sequons rather than elsewhere. The increasing sequon densities of the hemagglutinin (HA) of influenza viruses A/H3N2 and A/H1N1 during the past few decades of human infection also result from an increased conditional probability that Asn, Thr, and Ser are present in sequons rather than elsewhere. In contrast, there is no selection on sequons by this mechanism in the HA of A/H5N1 or 2009 A/H1N1 (swine flu). Very strong selection for sequons with both Thr and Ser in the glycoprotein of Mr 120,000 (gp120) of HIV and related retroviruses results from this same mechanism, as well as from amino acid composition bias and increased AT content. We conclude that there is Darwinian selection for sequons in phylogenetically disparate eukaryotes and viruses.
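The quantities named in the abstract (sequon density, and whether Asn and Thr/Ser occur in sequons more often than elsewhere) can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not describe. It assumes a simple null model in which residues occur independently at their overall frequencies, and it uses the observed/expected ratio as a rough proxy for the "increased conditional probability" the abstract refers to. The function names, the per-100-residue scaling, and the optional exclusion of Pro at the X position are all assumptions made for illustration.

```python
import re
from collections import Counter


def count_sequons(seq, hydroxy="ST", exclude_pro=True):
    """Count N-X-[S/T] motifs, including overlapping ones.

    hydroxy: residues allowed at the third position ("T", "S", or "ST").
    exclude_pro: assumption, skip motifs with Pro at the X position.
    """
    middle = "[^P]" if exclude_pro else "."
    # A zero-width lookahead lets overlapping motifs be counted.
    pattern = re.compile(rf"(?=N{middle}[{hydroxy}])")
    return sum(1 for _ in pattern.finditer(seq))


def sequon_density(seq, hydroxy="ST", exclude_pro=True):
    """Sequons per 100 residues (the scaling is an illustrative choice)."""
    if not seq:
        return 0.0
    return 100.0 * count_sequons(seq, hydroxy, exclude_pro) / len(seq)


def expected_sequons(seq, hydroxy="ST", exclude_pro=True):
    """Expected count if residues were placed independently (null model)."""
    n = len(seq)
    if n < 3:
        return 0.0
    freq = Counter(seq)
    p_asn = freq["N"] / n
    p_mid = 1.0 - freq["P"] / n if exclude_pro else 1.0
    p_hyd = sum(freq[a] for a in hydroxy) / n
    # There are n - 2 possible start positions for a three-residue motif.
    return (n - 2) * p_asn * p_mid * p_hyd


def sequon_enrichment(seq, hydroxy="ST", exclude_pro=True):
    """Observed/expected ratio; values above 1 suggest enrichment."""
    expected = expected_sequons(seq, hydroxy, exclude_pro)
    if expected == 0:
        return float("nan")
    return count_sequons(seq, hydroxy, exclude_pro) / expected


if __name__ == "__main__":
    toy = "MNATNGSAPNPTK"  # made-up peptide, not real data
    print(count_sequons(toy), sequon_density(toy), sequon_enrichment(toy))
```

Calling the functions with `hydroxy="T"` and `hydroxy="S"` separately gives the Thr-only and Ser-only counts, which is the distinction the abstract draws for eukaryotes with N-glycan-dependent QC. A real analysis would restrict the input to secreted and membrane proteins and would need a more careful null model than independent residue placement.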