Konno H, Fukunishi Y, Shibata K, Itoh M, Carninci P, Sugahara Y, Hayashizaki Y
Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center, Yokohama 230-0045, Japan.
Genome Res. 2001 Feb;11(2):281-9. doi: 10.1101/gr.gr-1457r.
We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. The construction process comprises three stages: assessing library quality, sequencing the 3' ends of inserts and clustering the resulting sequences, and re-arraying clones to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5' ends of the inserts to check the quality of each library and then determine its sequencing priority. Selected libraries undergo large-scale sequencing of the 3' ends of the inserts, and the resulting tag sequences are clustered. After clustering, the nonredundant library is constructed from the original libraries, which contain redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is stored in the database under these identifiers. At press time, our system had been in place for two years; we had clustered 939,725 3' end sequences from 227 cDNA libraries/sublibraries into 127,385 groups (see http://genome.gse.riken.go.jp/).
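The re-array step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Clone` record, its field names, the identifier format, and the "first clone seen represents its cluster" rule are all assumptions made for illustration. The abstract states only that every library, plate, clone, sequence, and cluster carries a unique identifier and that a nonredundant library is built from the redundant ones after clustering.

```python
from typing import Iterable, NamedTuple


class Clone(NamedTuple):
    """Hypothetical clone record; field names are illustrative assumptions."""
    library_id: str
    plate_id: str
    well: str
    cluster_id: int  # cluster assigned by clustering the 3' end tag sequence

    @property
    def clone_id(self) -> str:
        # Assumed identifier scheme: library, plate, and well joined by hyphens.
        return f"{self.library_id}-{self.plate_id}-{self.well}"


def select_rearray(clones: Iterable[Clone]) -> list[Clone]:
    """Keep one representative clone per cluster.

    Assumption: the first clone encountered for a cluster represents it.
    A real pipeline might instead prefer the clone with the best 5' end.
    """
    representatives: dict[int, Clone] = {}
    for clone in clones:
        representatives.setdefault(clone.cluster_id, clone)
    return list(representatives.values())


def mean_cluster_size(n_sequences: int, n_clusters: int) -> float:
    """Average number of sequences per cluster."""
    return n_sequences / n_clusters


if __name__ == "__main__":
    redundant = [
        Clone("lib01", "p001", "A01", cluster_id=10),
        Clone("lib01", "p001", "A02", cluster_id=10),  # same cluster as A01
        Clone("lib02", "p007", "B05", cluster_id=11),
    ]
    for clone in select_rearray(redundant):
        print(clone.clone_id, clone.cluster_id)
    # Redundancy implied by the figures reported in the abstract.
    print(round(mean_cluster_size(939_725, 127_385), 1))
```

With the figures reported in the abstract, 939,725 sequences in 127,385 groups works out to roughly 7.4 sequences per group, which gives a sense of how much redundancy the re-array removes.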