Zhang Peifen, Berardini Tanya Z, Ebert Dustin, Li Qian, Mi Huaiyu, Muruganujan Anushya, Prithvi Trilok, Reiser Leonore, Sawant Swapnil, Thomas Paul D, Huala Eva
Phoenix Bioinformatics, Fremont, CA, USA.
Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA.
Plant Direct. 2020 Dec 30;4(12):e00293. doi: 10.1002/pld3.293. eCollection 2020 Dec.
We aim to enable the accurate and efficient transfer of knowledge about gene function gained from Arabidopsis thaliana and other model organisms to other plant species. This knowledge transfer is frequently challenging in plants due to duplications of individual genes and whole genomes in plant lineages. Such duplications result in complex evolutionary relationships between related genes, which may have similar sequences but highly divergent functions. In such cases, functional inference requires more than a simple sequence similarity calculation. We have developed an online resource, PhyloGenes (phylogenes.org), that displays precomputed phylogenetic trees for plant gene families along with experimentally validated function information for individual genes within the families. A total of 40 plant genomes and 10 non-plant model organisms are represented in over 8,000 gene families. Evolutionary events such as speciation and duplication are clearly labeled on gene trees to distinguish orthologs from paralogs. Nearly 6,000 families have at least one member with an experimentally supported annotation to a Gene Ontology (GO) molecular function or biological process term. By displaying experimentally validated gene functions associated with individual genes within a tree, PhyloGenes enables functional inference for genes of uncharacterized function, based on their evolutionary relationships to experimentally studied genes, in a visually traceable manner. For the many families containing genes that have evolved to perform different functions, PhyloGenes facilitates the use of evolutionary history to determine the most likely function of genes that have not been experimentally characterized. Future work will enrich the resource by incorporating additional gene function datasets such as plant gene expression atlas data.
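The role of labeled speciation and duplication events in separating orthologs from paralogs can be illustrated with a minimal sketch. It uses the standard convention that two genes are orthologs when their last common ancestor in the gene tree is a speciation node, and paralogs when it is a duplication node. This is not PhyloGenes' own code: the `Node` class, the `classify_pairs` function, and the gene names in the toy tree are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Node:
    """A gene-tree node; internal nodes carry an event label (hypothetical structure)."""
    name: str
    event: Optional[str] = None  # "speciation" or "duplication" for internal nodes
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Return the names of all leaf genes below a node."""
    if not node.children:
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(leaves(child))
    return names


def classify_pairs(root: Node) -> Dict[FrozenSet[str], str]:
    """Label every pair of leaf genes by the event at their last common ancestor."""
    result: Dict[FrozenSet[str], str] = {}

    def visit(node: Node) -> None:
        if not node.children:
            return
        # Genes in different child subtrees have this node as their last common ancestor.
        label = "ortholog" if node.event == "speciation" else "paralog"
        for left, right in combinations([leaves(c) for c in node.children], 2):
            for a in left:
                for b in right:
                    result[frozenset((a, b))] = label
        for child in node.children:
            visit(child)

    visit(root)
    return result


# Toy family (invented gene names): one ancient duplication, then a speciation in each copy.
tree = Node("n0", "duplication", [
    Node("n1", "speciation", [Node("speciesA_gene1"), Node("speciesB_gene1")]),
    Node("n2", "speciation", [Node("speciesA_gene2"), Node("speciesB_gene2")]),
])

pairs = classify_pairs(tree)
for pair, label in sorted(pairs.items(), key=lambda item: sorted(item[0])):
    print(sorted(pair), label)
```

In this toy family, `speciesA_gene1` and `speciesB_gene2` are paralogs even though they come from different species, because their lineages separated at the duplication. This is the kind of case in which a sequence similarity search alone could suggest the wrong gene as the source for function transfer.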