Zhang Chi-Yu, Wei Ji-Fu, He Shao-Heng
Department of Biochemistry and Molecular Biology, Jiangsu University School of Medical Technology, Zhenjiang, Jiangsu 212001, China.
BMC Microbiol. 2006 Oct 4;6:88. doi: 10.1186/1471-2180-6-88.
Animal-to-human transmission of severe acute respiratory syndrome (SARS) coronavirus (CoV) is believed to have caused the worldwide SARS outbreak. The spike (S) protein, one of the best-characterized proteins of SARS-CoV, plays a key role in overcoming the species barrier and in interspecies transmission from animals to humans, suggesting that it may be the major target of selective pressure. However, the process of adaptive evolution of the S protein and the exact positively selected sites associated with this process remain unknown.
By investigating the adaptive evolution of the S protein, we identified twelve amino acid sites (75, 239, 244, 311, 479, 609, 613, 743, 765, 778, 1148, and 1163) in the S protein under positive selective pressure. Based on the phylogenetic tree and epidemiological investigation, the SARS outbreak was divided in the present study into three epidemic groups: 02-04 interspecies, 03-early-mid, and 03-late. Positive selection was detected in the first two groups, which represent the course of SARS-CoV interspecies transmission and of viral adaptation to the human host, respectively. In contrast, purifying selection was detected in the 03-late group. These results indicate that the S protein experienced variable positive selective pressures before reaching stabilization. A total of 25 sites in the 02-04 interspecies epidemic group and 16 sites in the 03-early-mid epidemic group were identified as being under positive selection. The identified sites differed between these two groups except for site 239, which suggests that positively selected sites change between groups. Moreover, in the 02-04 interspecies epidemic group, a larger proportion (24%) of positively selected sites was located in the receptor-binding domain (RBD) than in the heptad repeat (HR)1-HR2 region (p = 0.0208), whereas in the 03-early-mid epidemic group a greater percentage (25%) of these sites occurred in the HR1-HR2 region than in the RBD (p = 0.0721). These results suggest that functionally different domains of the S protein may not experience the same positive selection in each epidemic group. In addition, three specific replacements (F360S, T487S and L665S) were found only between 03-human SARS-CoVs and strains from the 02-04 interspecies epidemic group, which suggests that a selective sweep may also have driven the evolution of S genes before SARS-CoVs jumped into human hosts.
Since certain residues at these positively selected sites are associated with receptor recognition and/or membrane fusion, they are likely to be the crucial residues for animal-to-human transmission of SARS-CoVs, and subsequent adaptation to human hosts.
Variation in positive selective pressures and in positively selected sites is likely to contribute to the adaptive evolution of the S protein from animals to humans.
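To illustrate how positively selected sites can be related to functional domains, the following minimal Python sketch assigns the twelve reported sites to regions of the S protein. The residue ranges are assumptions added here for illustration and are not taken from the abstract: the RBD is set to the commonly cited residues 318-510, and the HR1-HR2 region is given an approximate span of 892-1195. The function names are hypothetical. The sketch does not reproduce the reported percentages, which were computed separately for each epidemic group.

```python
# Twelve positively selected S-protein sites reported in the abstract.
SITES = [75, 239, 244, 311, 479, 609, 613, 743, 765, 778, 1148, 1163]

# ASSUMPTION: approximate, inclusive residue ranges chosen for illustration.
# They are not stated in the abstract and may differ from the authors' definitions.
DOMAINS = {
    "RBD": (318, 510),       # receptor-binding domain (commonly cited range)
    "HR1-HR2": (892, 1195),  # heptad repeat region (approximate span)
}


def assign_domain(site, domains=DOMAINS):
    """Return the name of the first domain whose range contains the site."""
    for name, (start, end) in domains.items():
        if start <= site <= end:
            return name
    return "other"


def group_by_domain(sites, domains=DOMAINS):
    """Group sites by domain, preserving the input order within each group."""
    groups = {}
    for site in sites:
        groups.setdefault(assign_domain(site, domains), []).append(site)
    return groups


if __name__ == "__main__":
    for name, members in group_by_domain(SITES).items():
        print(name, members)
```

Changing the ranges in `DOMAINS` changes the assignment, so the grouping should be read only as an example of the bookkeeping, not as a result of the study.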