Zwaenepoel Arthur, Diels Tim, Amar David, Van Parys Thomas, Shamir Ron, Van de Peer Yves, Tzfadia Oren
Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
VIB Center for Plant Systems Biology, Ghent, Belgium.
Front Plant Sci. 2018 Mar 19;9:352. doi: 10.3389/fpls.2018.00352. eCollection 2018.
Recent years have seen enormous growth in "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene-expression-based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). One previously developed guilt-by-association method is MORPH, which has been shown to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene-centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.