Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier, France.
French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, F-34398 Montpellier, France.
Nucleic Acids Res. 2021 Jan 8;49(D1):D1464-D1471. doi: 10.1093/nar/gkaa1068.
Comparative genomics is the analysis of genomic relationships among different species and serves as an important foundation for evolutionary and functional genomic studies. GreenPhylDB (https://www.greenphyl.org) is a database designed to facilitate the exploration of gene families and homologous relationships among plant genomes, including staple crops critically important for global food security. GreenPhylDB has been available since 2007, after the release of the Arabidopsis thaliana and Oryza sativa genomes, and has undergone multiple releases. With the number of plant genomes currently available, it has become challenging to select a single reference for comparative genomics studies, yet there is still a lack of databases that take advantage of several genomes per species for orthology detection. GreenPhylDBv5 introduces the concept of comparative pangenomics by harnessing multiple genome sequences per species. We created 19 pangenes and processed them together with other species that still rely on a single genome. In total, 46 plant species were considered to build gene families and predict their homologous relationships through phylogeny-based analyses. In addition, since the previous publication, we have refreshed the website and included a new set of original tools, including protein-domain combination and tree topology searches and a section for users to store their own results in order to support community curation efforts.