Version VI of the ESTree db: an improved tool for peach transcriptome analysis.

Authors

Lazzari Barbara, Caprera Andrea, Vecchietti Alberto, Merelli Ivan, Barale Francesca, Milanesi Luciano, Stella Alessandra, Pozzi Carlo

Affiliation

Parco Tecnologico Padano, Via Einstein - Località Cascina Codazza, Lodi, 26900, Italy.

Publication

BMC Bioinformatics. 2008 Mar 26;9 Suppl 2(Suppl 2):S9. doi: 10.1186/1471-2105-9-S2-S9.

DOI:10.1186/1471-2105-9-S2-S9

PMID:18387211

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2323672/

Abstract

BACKGROUND

The ESTree database (db) is a collection of Prunus persica and Prunus dulcis EST sequences that, in its current version, encompasses 75,404 sequences from 3 almond and 19 peach libraries. Nine peach genotypes and four peach tissues are represented, from four fruit developmental stages. The aim of this work was to extend the existing ESTree db by adding new sequences and analysis programs. Particular care was given to the web interface, which allows each of the database features to be queried.

RESULTS

A modular Perl pipeline is the backbone of sequence analysis in the ESTree db project. Outputs obtained during the pipeline steps are automatically loaded into the fields of a MySQL database. In addition to standard clustering and annotation analyses, version VI of the ESTree db includes new tools for tandem repeat identification, annotation against genomic Rosaceae sequences, and positioning on the database of the oligomer sequences used in a peach microarray study. Furthermore, known protein patterns and motifs were identified by comparison to PROSITE. Based on data retrieved from sequence annotation against the UniProtKB database, a script was prepared to track the positions of homologous hits on the GO tree and to compile statistics on the distribution of ontologies across GO functional categories. EST mapping data were also integrated into the database. The PHP-based web interface was upgraded and extended so that the database can be queried according to all the biological aspects that can be investigated from the data available in the ESTree db. This is achieved by allowing multiple searches on logical subsets of sequences that represent different biological situations or features.
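
As an illustration of the GO statistics step, the following is a minimal Python sketch of how annotated hits might be traced up a GO-like tree and tallied per top-level category. It is not the authors' script (the ESTree pipeline is written in Perl, and its details are not given here). The term identifiers, the toy parent map, the EST names, and the function names `root_categories` and `category_distribution` are all hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical toy ontology: each term maps to its parent terms.
# Real GO identifiers and relationships would be read from the ontology itself.
TOY_PARENTS = {
    "T_A1": ["ROOT_MF"],
    "T_A2": ["T_A1"],
    "T_B1": ["ROOT_BP"],
    "T_B2": ["T_B1"],
    "T_C1": ["ROOT_CC"],
}

# The three top-level GO categories, keyed by placeholder root identifiers.
TOY_ROOTS = {
    "ROOT_MF": "molecular_function",
    "ROOT_BP": "biological_process",
    "ROOT_CC": "cellular_component",
}


def root_categories(term, parents, roots):
    """Return the names of all root categories reachable from `term`."""
    seen = set()
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a term can be reached through more than one path
        seen.add(current)
        if current in roots:
            found.add(roots[current])
        stack.extend(parents.get(current, []))
    return found


def category_distribution(est_to_terms, parents, roots):
    """Count how many ESTs fall into each root category (once per EST)."""
    counts = Counter()
    for terms in est_to_terms.values():
        categories = set()
        for term in terms:
            categories |= root_categories(term, parents, roots)
        counts.update(categories)
    return counts


# Hypothetical annotations: EST name -> terms inherited from its best hit.
EXAMPLE_ANNOTATIONS = {
    "EST_1": ["T_A2"],
    "EST_2": ["T_B1", "T_C1"],
    "EST_3": ["T_A1", "T_B2"],
    "EST_4": [],  # no annotated hit
}

distribution = category_distribution(EXAMPLE_ANNOTATIONS, TOY_PARENTS, TOY_ROOTS)
print(dict(distribution))
```

Counting each EST at most once per category keeps a sequence with several terms in the same branch from inflating that category; whether the original script counts this way is an assumption of the sketch.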

CONCLUSIONS

Version VI of the ESTree db offers a broad overview of peach gene expression. The sequence analysis results contained in the database, extensively linked to related external resources, represent a large amount of information that can be queried via the tools offered in the web interface. The flexibility and modularity of the ESTree analysis pipeline and of the web interface allowed the authors to set up similar structures for different datasets with limited manual intervention.
