Wahl R, Rice P, Rice C M, Kröger M
Institut für Mikrobiologie und Molekularbiologie, Fachbereich Biologie, Justus-Liebig-Universität Giessen, Germany.
Nucleic Acids Res. 1994 Sep;22(17):3450-5. doi: 10.1093/nar/22.17.3450.
We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and, independently, from the literature. This update of our Escherichia coli database (ECD release 20) introduces major changes compared with previous issues. It not only represents another substantial increase in sequence information, it also makes it possible to find the exact physical location of each individual gene or regulatory region, even where the nomenclature is discrepant. To save space, this printed version no longer contains the database itself, but we provide several examples. The complete database is publicly available in electronic form, either together with a self-explanatory application program or as a flat file. The complete compilation, including a full set of genetic map data and the E. coli protein index, can be obtained in machine-readable form from the EMBL data library as part of the CD-ROM issue of the EMBL sequence database, which is released and updated every three months. After deletion of all detected overlaps, a total of 2,878,364 individual bp had been determined by the end of June 1994. This corresponds to 60.98% of the entire E. coli chromosome, which consists of about 4,720 kbp. This number may actually be higher by 9,161 bp derived from other strains of E. coli.
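The coverage figures quoted above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of ECD. The function names and the toy intervals are hypothetical, and the interval merge is just one plausible way to count non-redundant base pairs once overlaps are removed; the abstract does not describe the procedure actually used. Only the three totals (2,878,364 bp, 9,161 bp, and about 4,720 kbp) come from the text.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals.

    Hypothetical helper: one simple way to remove overlaps between
    sequenced segments placed on a common chromosome coordinate system.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_bp(intervals):
    """Count non-redundant base pairs covered by the given intervals."""
    return sum(end - start + 1 for start, end in merge_intervals(intervals))


def coverage_percent(sequenced_bp, genome_bp):
    """Percentage of the genome represented by the sequenced base pairs."""
    return 100.0 * sequenced_bp / genome_bp


# Figures taken from the abstract.
GENOME_BP = 4_720_000      # "about 4,720 kbp"
SEQUENCED_BP = 2_878_364   # non-overlapping bp up to the end of June 1994
OTHER_STRAIN_BP = 9_161    # additional bp derived from other E. coli strains

print(f"{coverage_percent(SEQUENCED_BP, GENOME_BP):.2f}%")                    # 60.98%
print(f"{coverage_percent(SEQUENCED_BP + OTHER_STRAIN_BP, GENOME_BP):.2f}%")  # 61.18%

# Toy example (made-up coordinates): two overlapping segments and one separate one.
toy_segments = [(1, 100), (51, 150), (300, 400)]
print(total_bp(toy_segments))  # 251
```

Including the 9,161 bp from other strains would raise the covered fraction from 60.98% to roughly 61.18%, under the same assumed chromosome size of 4,720 kbp.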