TriFLDB: a database of clustered full-length coding sequences from Triticeae with applications to comparative grass genomics.

Authors

Mochida Keiichi, Yoshida Takuhiro, Sakurai Tetsuya, Ogihara Yasunari, Shinozaki Kazuo

Affiliation

Plant Science Center, RIKEN, Yokohama 230-0045, Japan.

Publication information

Plant Physiol. 2009 Jul;150(3):1135-46. doi: 10.1104/pp.109.138214. Epub 2009 May 15.

DOI:10.1104/pp.109.138214

PMID:19448038

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2705016/

Abstract

The Triticeae Full-Length CDS Database (TriFLDB) contains available information regarding full-length coding sequences (CDSs) of the Triticeae crops wheat (Triticum aestivum) and barley (Hordeum vulgare) and includes functional annotations and comparative genomics features. TriFLDB provides a search interface using keywords for gene function and related Gene Ontology terms and a similarity search for DNA and deduced translated amino acid sequences to access annotations of Triticeae full-length CDS (TriFLCDS) entries. Annotations consist of similarity search results against several sequence databases and domain structure predictions by InterProScan. The deduced amino acid sequences in TriFLDB are grouped with the proteome datasets for Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and sorghum (Sorghum bicolor) by hierarchical clustering in stepwise thresholds of sequence identity, providing hierarchical clustering results based on full-length protein sequences. The database also provides sequence similarity results based on comparative mapping of TriFLCDSs onto the rice and sorghum genome sequences, which together with current annotations can be used to predict gene structures for TriFLCDS entries. To provide the possible genetic locations of full-length CDSs, TriFLCDS entries are also assigned to the genetically mapped cDNA sequences of barley and diploid wheat, which are currently accommodated in the Triticeae Mapped EST Database. These relational data are searchable from the search interfaces of both databases. The current TriFLDB contains 15,871 full-length CDSs from barley and wheat and includes putative full-length cDNAs for barley and wheat, which are publicly accessible. This informative content provides an informatics gateway for Triticeae genomics and grass comparative genomics. TriFLDB is publicly available at http://TriFLDB.psc.riken.jp/.
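The grouping "by hierarchical clustering in stepwise thresholds of sequence identity" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from TriFLDB: the position-wise `identity` function stands in for a real alignment-based identity score, single-linkage grouping is only one possible linkage rule, and the sequences, names, and thresholds are toy values.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length.

    Assumption: a stand-in for an alignment-based identity score.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def cluster_at(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Single-linkage clusters: join any pair with identity >= threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


def stepwise_clusters(seqs: dict[str, str], thresholds: list[float]) -> dict[float, list[set[str]]]:
    """Cluster at each threshold, from the strictest to the most permissive."""
    return {t: cluster_at(seqs, t) for t in sorted(thresholds, reverse=True)}


# Hypothetical toy protein sequences, not real database entries.
toy = {
    "wheat_1": "MKTAYIAKQR",
    "barley_1": "MKTAYIAKQK",
    "rice_1": "MKTAYLSKQK",
    "arabidopsis_1": "MSSPQLLVNG",
}

if __name__ == "__main__":
    for t, clusters in stepwise_clusters(toy, [0.85, 0.75, 0.5]).items():
        print(t, clusters)
```

Because every pair linked at a strict threshold is also linked at a looser one, the clusters at successive thresholds nest inside each other, which is what makes the result hierarchical.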
