Kálmán M, Cserpán I, Bajszár G, Dobi A, Horváth E, Pázmán C, Simoncsits A
Institute of Genetics, Hungarian Academy of Sciences, Szeged.
Nucleic Acids Res. 1990 Oct 25;18(20):6075-81. doi: 10.1093/nar/18.20.6075.
A 1761-base-pair artificial gene coding for human serum albumin (HSA) has been prepared by a newly developed synthetic approach, making it the largest synthetic gene described so far. Oligonucleotides corresponding to only one strand of the HSA gene were prepared by chemical synthesis, while the complementary strand was obtained by a combination of enzymatic and cloning steps. Twenty-four synthetic oligonucleotides, each 69-85 nucleotides long and together covering the major part of the HSA gene (nucleotides 41-1761), were used as building blocks. In general, four groups of six such oligonucleotides were successively cloned into the Escherichia coli vector pUC19 to obtain large fragments, each representing about a quarter of the gene. Joining these four fragments gave a cloned DNA coding for amino acids 13-585 of HSA, which was then supplemented with a double-stranded linker sequence coding for the amino-terminal 12 amino acids. The completed structural gene, composed of codons frequently used in highly expressed yeast genes, was then supplied with yeast regulatory sequences, and the resulting HSA expression cassette was inserted into an Escherichia coli-Saccharomyces cerevisiae shuttle vector. This vector was shown to direct the expression in Saccharomyces cerevisiae of correctly processed, mature HSA, which was recognized by antiserum to HSA and possessed the correct N-terminal amino acid sequence.
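The figures quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the published work: all variable names are hypothetical, and it assumes, as a simplification, that the 24 oligonucleotides tile the covered region end to end on a single strand without overlap, which the abstract does not state explicitly.

```python
# Figures taken from the abstract.
GENE_LENGTH_NT = 1761                   # total length of the synthetic gene
COVERED_START, COVERED_END = 41, 1761   # region covered by the synthetic oligonucleotides
N_OLIGOS = 24                           # number of synthetic oligonucleotides
OLIGO_MIN_NT, OLIGO_MAX_NT = 69, 85     # reported oligonucleotide length range
N_GROUPS = 4                            # cloned groups, each about a quarter of the gene
MATURE_HSA_RESIDUES = 585               # amino acids in mature HSA


def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based nucleotide interval."""
    return end - start + 1


# Nucleotides covered by the chemically synthesized strand.
covered_nt = span_length(COVERED_START, COVERED_END)

# Mean oligonucleotide length under the non-overlapping tiling assumption.
mean_oligo_nt = covered_nt / N_OLIGOS

# Oligonucleotides per cloned group.
oligos_per_group = N_OLIGOS // N_GROUPS

# Nucleotides needed to encode the 585 residues, and what is left over.
# The abstract does not say what the remaining nucleotides are.
coding_nt = MATURE_HSA_RESIDUES * 3
extra_nt = GENE_LENGTH_NT - coding_nt

print(f"covered region: {covered_nt} nt")
print(f"mean oligonucleotide length: {mean_oligo_nt:.1f} nt")
print(f"oligonucleotides per group: {oligos_per_group}")
print(f"coding nucleotides: {coding_nt}, remaining: {extra_nt}")
```

Under that assumption the mean length falls inside the reported 69-85 nucleotide range, and the group size matches the six oligonucleotides per group given in the abstract.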