Sequence analysis of the core gene of 14 hepatitis C virus genotypes.

Author information

Bukh J, Purcell R H, Miller R H

Affiliation

Hepatitis Viruses Section, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892.

Publication information

Proc Natl Acad Sci U S A. 1994 Aug 16;91(17):8239-43. doi: 10.1073/pnas.91.17.8239.

Abstract

We previously sequenced the 5' noncoding region of 44 isolates of hepatitis C virus (HCV), as well as the envelope 1 (E1) gene of 51 HCV isolates, and provided evidence for the existence of at least 6 major genetic groups consisting of at least 12 minor genotypes of HCV (i.e., genotypes I/1a, II/1b, III/2a, IV/2b, 2c, V/3a, 4a-4d, 5a, and 6a). We now report the complete nucleotide sequence of the putative core (C) gene of 52 HCV isolates that represent all of these 12 genotypes as well as two additional genotypes provisionally designated 4e and 4f that we identified in this study. The phylogenetic analysis of the C gene sequences was in agreement with that of the E1 gene sequences. A major division in the genetic distance was observed between HCV isolates of genotype 2 and those of the other genotypes in analysis of both the E1 and C genes. The C gene sequences of 9 genotypes have not been reported previously (i.e., genotypes 2c, 4a-4f, 5a, and 6a). Our analysis indicates that the C gene-based methods currently used to determine the HCV genotype, such as PCR with genotype-specific primers, should be revised in light of these data. We found that the predicted C gene was exactly 573 nt long in all 52 HCV isolates, with an N-terminal start codon and no in-frame stop codons. The nucleotide and predicted amino acid identities of the C gene sequences were in the range of 79.4-99.0% and 85.3-100%, respectively. Furthermore, we mapped universally conserved, as well as genotype-specific, nucleotide and deduced amino acid sequences of the C gene. 
The predicted C proteins of the different HCV genotypes shared the following features: (i) high content of proline residues, (ii) high content of arginine and lysine residues located primarily in three domains with 10 such residues invariant at positions 39-62, (iii) a cluster of 5 conserved tryptophan residues, (iv) two nuclear localization signals and a DNA-binding motif, (v) a potential phosphorylation site with a serine-proline motif, and (vi) three conserved hydrophilic domains that have been shown by others to contain immunogenic epitopes. Thus, we have extended analysis of the predicted C protein of HCV to all of the recognized genotypes, confirmed the existence of highly conserved regions of this important structural protein, and demonstrated that the genetic relatedness of HCV isolates is equivalent when analyzing the most conserved (i.e., C) and the most variable (i.e., E1) genes of the HCV genome.
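The two sequence-level checks mentioned in the abstract (a 573-nt open reading frame with a start codon and no in-frame stop codons, and pairwise percent identity between sequences) can be illustrated with a minimal Python sketch. The function names `check_orf` and `percent_identity` are hypothetical, the toy sequences are synthetic and are not HCV data, and the identity calculation assumes the two sequences are already aligned and of equal length. The abstract does not state how the authors computed their identities, so this is an illustration only.

```python
# Minimal sketch; names and toy sequences are illustrative assumptions,
# not the authors' method or data.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf(seq, expected_len=573):
    """Check length, start codon, and absence of in-frame stop codons."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabet
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return {
        "length_ok": len(seq) == expected_len,
        "starts_with_atg": seq.startswith("ATG"),
        "no_inframe_stop": not any(c in STOP_CODONS for c in codons),
    }


def percent_identity(a, b):
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Synthetic 573-nt example: ATG followed by 190 GCT codons (191 codons total).
toy_gene = "ATG" + "GCT" * 190
toy_result = check_orf(toy_gene)
```

The same `percent_identity` function applies to amino acid sequences, provided they are aligned first.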
