Identifying essential genes across eukaryotes by machine learning.

Authors

Beder Thomas, Aromolaran Olufemi, Dönitz Jürgen, Tapanelli Sofia, Adedeji Eunice O, Adebiyi Ezekiel, Bucher Gregor, Koenig Rainer

Affiliations

Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Am Klinikum 1, 07747 Jena, Germany.

Department of Computer & Information Sciences, Covenant University, Ota, Ogun State, Nigeria.

Publication

NAR Genom Bioinform. 2021 Nov 30;3(4):lqab110. doi: 10.1093/nargab/lqab110. eCollection 2021 Dec.

Abstract

Identifying essential genes on a genome scale is resource intensive and has been performed for only a few eukaryotes. For less studied organisms, essentiality might be predicted by gene homology. However, this approach cannot be applied to non-conserved genes. Additionally, divergent essentiality information is obtained from studying single cells or whole, multi-cellular organisms, particularly when it is derived from human cell line screens and human population studies. We employed machine learning across six model eukaryotes and 60 381 genes, using 41 635 features derived from the sequence, gene function information and network topology. Within a leave-one-organism-out cross-validation, the classifiers showed high generalizability, with an average accuracy close to 80% in the left-out species. As a case study, we applied the method to [species names missing from the source] and validated the predictions experimentally, yielding similar performance. Finally, using the classifier based on the studied model organisms enabled linking the essentiality information of human cell line screens and population studies.
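
The leave-one-organism-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`leave_one_organism_out`, `mean_held_out_accuracy`, `majority_class`) and the toy majority-class predictor are hypothetical and chosen only for illustration. The idea shown is that all genes of one organism are held out together, a model is fitted on the remaining organisms, and the per-organism accuracies are averaged.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterator, List, Sequence, Tuple


def leave_one_organism_out(
    organisms: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_organism, train_indices, test_indices) per organism."""
    by_organism: Dict[str, List[int]] = defaultdict(list)
    for idx, organism in enumerate(organisms):
        by_organism[organism].append(idx)
    for held_out in sorted(by_organism):
        test_idx = by_organism[held_out]
        # Every gene from every other organism goes into the training set.
        train_idx = sorted(
            i
            for organism, indices in by_organism.items()
            if organism != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


def mean_held_out_accuracy(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    organisms: Sequence[str],
    fit_predict: Callable[[list, list, list], List[int]],
) -> float:
    """Average the accuracy obtained on each left-out organism."""
    accuracies = []
    for _, train_idx, test_idx in leave_one_organism_out(organisms):
        predicted = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


def majority_class(train_x: list, train_y: list, test_x: list) -> List[int]:
    """Toy stand-in for a real classifier: predict the most frequent label."""
    most_common = max(sorted(set(train_y)), key=train_y.count)
    return [most_common] * len(test_x)
```

Any classifier with the same `fit_predict` shape could replace `majority_class`. Grouping the split by organism, rather than sampling genes at random, is what lets the averaged score reflect performance on a species the model has not seen.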
