Carrillo Alexander J, Cabrera Ilva E, Spasojevic Marko J, Schacht Patrick, Stajich Jason E, Borkovich Katherine A
Department of Microbiology and Plant Pathology, University of California, 900 University Avenue, Riverside, CA, 92521, USA.
Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, California, 92521, USA.
BMC Genomics. 2020 Nov 2;21(1):755. doi: 10.1186/s12864-020-07131-7.
With 9730 protein-coding genes and a nearly complete gene knockout strain collection, Neurospora crassa is a major model organism for filamentous fungi. Despite this abundance of information, the phenotypes of these gene knockout mutants have not been categorized to determine whether there are broad correlations between phenotype and any genetic features.
Here, we analyze data for 10 different growth or developmental phenotypes that have been obtained for 1168 N. crassa knockout mutants. Of these mutants, 265 (23%) are in the normal range, while 903 (77%) possess at least one mutant phenotype. With the exception of unclassified functions, the distribution of functional categories for genes in the mutant dataset mirrors that of the N. crassa genome. In contrast, most genes do not possess a yeast ortholog, suggesting that our analysis will reveal functions that are not conserved in Saccharomyces cerevisiae. To leverage the phenotypic data to identify pathways, we used a weighted Partitioning Around Medoids (PAM) approach with 40 clusters. We found that genes encoding metabolic, transmembrane and protein phosphorylation-related proteins are concentrated in subsets of clusters. Results from K-Means clustering of transcriptomic datasets showed that most phenotypic clusters contain multiple expression profiles, suggesting that co-expression is not generally observed for genes with shared phenotypes. Analysis of yeast orthologs of genes that co-clustered in MAPK signaling cascades revealed potential networks of interacting proteins in N. crassa.
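To illustrate the kind of clustering named above, the following is a minimal sketch of a PAM-style (k-medoids) procedure, not the authors' implementation. The abstract does not say how the weighting was applied, so the sketch assumes one weight per phenotype inside a Euclidean distance. It also replaces PAM's BUILD initialization with a random start and keeps only the swap phase. The function names, the toy data, and the choice of 2 clusters (the study used 40) are all hypothetical.

```python
import random
from math import sqrt


def weighted_distance(a, b, weights):
    """Euclidean distance with one (assumed) weight per phenotype."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def pam(points, k, weights, max_iter=100, seed=0):
    """Swap-phase k-medoids: returns (medoid indices, labels, total cost)."""
    n = len(points)
    # Precompute all pairwise weighted distances.
    dist = [[weighted_distance(points[i], points[j], weights)
             for j in range(n)] for i in range(n)]

    def total_cost(meds):
        # Sum of distances from every point to its nearest medoid.
        return sum(min(dist[i][m] for m in meds) for i in range(n))

    # Random start (assumption: stands in for PAM's BUILD step).
    medoids = random.Random(seed).sample(range(n), k)
    cost = total_cost(medoids)

    for _ in range(max_iter):
        best_trial, best_cost = None, cost
        # Try replacing each medoid with each non-medoid point.
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(trial)
                if trial_cost < best_cost - 1e-12:
                    best_trial, best_cost = trial, trial_cost
        if best_trial is None:  # no swap lowers the cost: converged
            break
        medoids, cost = best_trial, best_cost

    # Assign each point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
              for i in range(n)]
    return medoids, labels, cost


# Hypothetical toy data: six "mutants" scored on two phenotypes.
toy = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
       [5.0, 5.1], [5.1, 5.0], [5.05, 5.05]]
medoids, labels, cost = pam(toy, k=2, weights=[1.0, 1.0])
```

Because each medoid is an actual data point, every cluster in this kind of analysis is represented by a real mutant's phenotype profile rather than by an averaged centroid, which is one common reason to prefer PAM over K-Means for small, mixed-scale phenotype tables.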
Our results demonstrate that clustering analysis of phenotypes is a promising tool for generating new hypotheses regarding involvement of genes in cellular pathways in N. crassa. Furthermore, information about gene clusters identified in N. crassa should be applicable to other filamentous fungi, including saprobes and pathogens.