Sato Kazuhiro, Shin-I Tadasu, Seki Motoaki, Shinozaki Kazuo, Yoshida Hideya, Takeda Kazuyoshi, Yamazaki Yukiko, Conte Matthieu, Kohara Yuji
Research Institute for Bioresources, Okayama University, Kurashiki, Japan.
DNA Res. 2009 Apr;16(2):81-9. doi: 10.1093/dnares/dsn034. Epub 2009 Jan 15.
A collection of 5006 full-length (FL) cDNA sequences was developed in barley. Fifteen mRNA samples from various organs and treatments were pooled to construct a cDNA library using the CAP trapper method. More than 60% of the clones were confirmed to have complete coding sequences, based on comparison with rice amino acid and UniProt sequences. Blastn homologies (E<1E-5) to rice genes and Arabidopsis genes were 89% and 47%, respectively. Of the 5028 possible amino acid sequences derived from the 5006 FLcDNAs, 4032 (80.2%) were classified into 1678 GreenPhyl multigenic families. There were 555 cDNAs showing low homology to both rice and Arabidopsis. Gene ontology annotation by InterProScan indicated that many of these cDNAs (71%) have no known molecular functions and may be unique to barley. The cDNAs showed high homology to Barley 1 GeneChip oligo probes (81%) and the wheat gene index (84%). The high homology between FLcDNAs (27%) and mapped barley expressed sequence tags enabled linkage map positions to be assigned to 151-233 FLcDNAs on each of the seven barley chromosomes. These comprehensive barley FLcDNAs provide a strong platform to connect pre-existing genomic and genetic resources and to accelerate gene identification and genome analysis in barley and related species. Sequence data from this article have been deposited with the DDBJ/EMBL/GenBank Data Libraries under accession nos AK248134-AK253139. The online database with annotation is available at http://www.shigen.nig.ac.jp/barley/.
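As a quick consistency check on the figures above, the deposited accession range AK248134-AK253139 can be expanded and counted, and the 80.2% GreenPhyl share recomputed. The sketch below is illustrative only: the helper name `expand_accession_range` is hypothetical, and it assumes that every accession in the range shares the `AK` prefix and that the numbering is contiguous, which the abstract implies but does not state.

```python
import re


def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand a range such as 'AK248134'..'AK253139' into individual accessions.

    Hypothetical helper: assumes a shared letter prefix and contiguous numbering.
    """
    pattern = r"([A-Z]+)(\d+)"
    m_first = re.fullmatch(pattern, first)
    m_last = re.fullmatch(pattern, last)
    if not m_first or not m_last:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same prefix")

    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep zero-padding if present
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError("last accession precedes first accession")

    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


# Range quoted in the abstract.
accessions = expand_accession_range("AK248134", "AK253139")

# Share of predicted amino acid sequences placed in GreenPhyl families.
greenphyl_share = round(100 * 4032 / 5028, 1)

print(len(accessions), greenphyl_share)
```

Under these assumptions the range contains 5006 accessions, matching the number of FLcDNAs reported, and 4032 of 5028 sequences corresponds to the stated 80.2%.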