Department of Veterinary Biosciences, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3010, Australia.
Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife 50740-465, PE, Brazil.
Int J Mol Sci. 2021 May 11;22(10):5056. doi: 10.3390/ijms22105056.
Experimental studies of Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular and cellular processes in metazoans at large. Since the publication of their genomes, functional genomic investigations have identified genes that are essential or non-essential for survival in each species. Recently, a range of features linked to gene essentiality have been inferred using a machine learning (ML)-based approach, allowing essentiality predictions within a species. Nevertheless, predictions between species remain elusive. Here, we undertake a comprehensive study using ML to discover and validate features of essential genes common to both C. elegans and D. melanogaster. We demonstrate that the cross-species prediction of gene essentiality is possible using a subset of features linked to nucleotide/protein sequences, protein orthology and subcellular localisation, single-cell RNA-seq, and histone methylation markers. Complementary analyses showed that essential genes are enriched for transcription and translation functions and are preferentially located away from heterochromatin regions of C. elegans and D. melanogaster chromosomes. The present work should enable the cross-prediction of essential genes between model and non-model metazoans.