Chao Ya-Ting, Yen Shao-Hua, Yeh Jen-Hau, Chen Wan-Chieh, Shih Ming-Che
Agricultural Biotechnology Research Center, Academia Sinica, Nankang, Taipei, Taiwan.
Plant Cell Physiol. 2017 Jan 1;58(1):e9. doi: 10.1093/pcp/pcw220.
Orchidaceae, the orchid family, encompasses more than 25,000 species in five subfamilies. Owing to their beautiful and exotic flowers and their distinct biological and ecological features, orchids have aroused wide interest among both researchers and the general public. We constructed the Orchidstra database, a resource for orchid transcriptome assembly and gene annotation, which has been under active development since 2013. To accommodate the increasing amount of orchid transcriptome data and to house more comprehensive information, Orchidstra 2.0 has been built on a new database system that stores the annotations of 510,947 protein-coding genes and 161,826 noncoding transcripts, covering 18 orchid species belonging to 12 genera in five subfamilies of Orchidaceae. We have improved the N50 size of protein-coding genes, provided new functional annotations (including protein-coding gene annotations, protein domain/family information, pathway analysis, Gene Ontology term assignments, orthologous genes across orchid species, cross-links to the databases of model species, and miRNA information), and improved the user interface and website performance. We also provide new functionalities for database searching and sequence retrieval. Moreover, the Orchidstra 2.0 database incorporates detailed RNA-Seq gene expression data from various tissues and developmental stages in different orchid species. The database will be useful for gene prediction, for gene family studies, and for exploring gene expression in orchid species. The Orchidstra 2.0 database is freely accessible at http://orchidstra2.abrc.sinica.edu.tw.
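For readers unfamiliar with the N50 statistic mentioned above: it is commonly defined as the length L such that sequences of length at least L together account for at least half of the total assembled bases. The following is a minimal Python sketch of that definition, not the authors' pipeline; the function name `n50` and the example lengths are hypothetical and are not taken from Orchidstra.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence down, accumulating total length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half the
        # assembly is the N50.
        if running >= half_total:
            return length


# Hypothetical transcript lengths in base pairs (illustration only).
example_lengths = [100, 200, 300, 400, 500, 600]
print(n50(example_lengths))  # prints 500
```

A larger N50 for the same set of genes generally indicates that more of the assembly sits in longer, more complete transcripts.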