Xiang Z, Moore K, Wood V, Rajandream M A, Barrell B G, Skelton J, Churcher C M, Lyne M H, Devlin K, Gwilliam R, Rutherford K M, Aves S J
School of Biological Sciences, University of Exeter, Washington Singer Laboratories, Perry Road, Exeter EX4 4QG, UK.
Yeast. 2000 Nov;16(15):1405-11. doi: 10.1002/1097-0061(200011)16:15<1405::AID-YEA625>3.0.CO;2-H.
One hundred and fourteen kilobase pairs (kb) of contiguous genomic sequence have been determined immediately distal to the his5 genetic marker, which lies about 0.9 Mb from the centromere on the long arm of Schizosaccharomyces pombe chromosome 2. The sequence is contained in the overlapping cosmid clones c16H5, c12D12, c24C6 and c19G7; 20 kb of it is identical to previously reported sequence from clone c21H7. The remaining 93 781 bp contains 10 known genes (cdc14, cdm1, cps1, gpa1, msh2, pck2, rip1, rps30-2, sad1 and ubl1), 32 open reading frames (ORFs) capable of coding for proteins of at least 100 amino acid residues in length, one 5S rRNA gene, one tRNA(Pro) gene, one lone Tf1-type long terminal repeat (LTR) and one lone Tf2-type LTR. The density is one protein-coding gene per 2.2 kb, and 22 of the 42 ORFs (52%) contain one or more introns. Twenty-one of the novel ORFs show sequence similarities that suggest functions for their products, including a cyclin C, a MADS box transcription factor, a mad2-like protein, a telomere binding protein, a topoisomerase II-associated protein, an ATP-dependent DEAH box RNA helicase, a G10 protein, a ubiquitin-activating E1-like enzyme, a nucleoporin, a prolyl-tRNA synthetase, a peptidylprolyl isomerase, a delta-1-pyrroline-5-carboxylate dehydrogenase, a protein transport protein, a coatomer epsilon subunit, a TCP-1 chaperonin, the beta-subunit of 6-phosphofructokinase, an aminodeoxychorismate lyase, a phosphate transport protein and a thioredoxin.
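The gene density and intron frequency quoted above follow directly from the counts given in the abstract. As a minimal sketch of that arithmetic (the variable names are illustrative and do not come from the paper), the figures can be rechecked as follows:

```python
# Counts taken from the abstract; variable names are illustrative assumptions.
novel_sequence_bp = 93_781   # sequence not already reported from clone c21H7
known_genes = 10             # previously characterised genes
other_orfs = 32              # ORFs encoding proteins of >= 100 residues
orfs_with_introns = 22       # ORFs with one or more introns

# The abstract treats known genes plus the other ORFs as 42 protein-coding ORFs.
protein_coding = known_genes + other_orfs

# Average spacing of protein-coding genes, in kb per gene.
kb_per_gene = novel_sequence_bp / protein_coding / 1000

# Fraction of protein-coding ORFs that contain introns.
intron_fraction = orfs_with_introns / protein_coding

print(f"{protein_coding} protein-coding ORFs")
print(f"one gene per {kb_per_gene:.1f} kb")
print(f"{intron_fraction:.0%} contain introns")
```

Only the 93 781 bp of new sequence is used as the denominator, since the gene counts refer to that region and not to the full 114 kb.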