Vlcek C, Paces V, Maltsev N, Paces J, Haselkorn R, Fonstein M
Institute of Molecular Genetics, Academy of Sciences of the Czech Republic, Flemingova 2, CZ-16637 Prague 6, Czech Republic.
Proc Natl Acad Sci U S A. 1997 Aug 19;94(17):9384-8. doi: 10.1073/pnas.94.17.9384.
Cosmids from the 1A3-1A10 region of the complete miniset were individually subcloned in the vector M13 mp18. The sequence of each cosmid was assembled from about 400 DNA fragments generated from the ends of these phage subclones, and the cosmid sequences were merged into one 189-kb contig. About 160 ORFs identified by the CodonUse program were subjected to similarity searches. Biological functions could be assigned reliably to 80 ORFs by using the WIT and Magpie genome investigation tools. Eighty percent of these recognizable ORFs were organized in functional clusters, which simplified assignment decisions and increased the strength of the predictions. Among those identified were a set of 26 genes for cobalamin biosynthesis and genes for polyhydroxyalkanoic acid metabolism, DNA replication and recombination, and DNA gyrase. Most of the ORFs lacking significant similarity to entries in reference databases were also grouped. There are two large clusters of these ORFs, one located between 45 and 67 kb of the map and the other between 150 and 183 kb. In the first of these clusters, 9 of the 15 loosely identified ORFs match ORFs from phages or transposons. The other cluster also has four ORFs of possible phage origin.
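As a rough illustration of the ORF identification step mentioned in the abstract, the sketch below scans all six reading frames of a DNA string for ATG-to-stop stretches. It is not the CodonUse program, which is not described here. It is a naive scan that assumes the standard genetic code and ATG starts only, and the names `find_orfs`, `reverse_complement`, and `min_codons` are hypothetical.

```python
from typing import List, Tuple

# Assumption: standard genetic code stop codons; only ATG is treated as a start.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int]]:
    """Naive six-frame ORF scan (illustrative only).

    Returns (strand, start, end) tuples. Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned; the stop
    codon is included in the reported span. min_codons counts codons
    from the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```

A scan like this only proposes candidates. Assigning function, as the abstract describes, still depends on similarity searches and on the clustering context of neighboring ORFs.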