
An integrated machine-learning model to predict prokaryotic essential genes.

Author information

Deng Jingyuan

Affiliation

Division of Epidemiology and Biostatistics, Department of Environmental Health, Cincinnati Children's Hospital, University of Cincinnati Medical Center, 3223 Eden Avenue, ML 56, Cincinnati, OH, 45267-0056, USA.

Publication information

Methods Mol Biol. 2015;1279:137-51. doi: 10.1007/978-1-4939-2398-4_9.

DOI: 10.1007/978-1-4939-2398-4_9
PMID: 25636617

Abstract

Essential genes are indispensable for the target organism's survival. Large-scale identification and characterization of essential genes has proven beneficial in both fundamental biology and medicine. Existing genome-scale experimental screens for essential genes are time-consuming and costly, and they sometimes yield erroneous essential-gene annotations. To circumvent these difficulties, many research groups have turned to computational approaches as an alternative way to identify essential genes. Here, we developed an integrative, machine-learning-based statistical framework to accurately predict essential genes in microorganisms. First, we extracted a variety of relevant features derived from different aspects of an organism's genomic sequences. Then we selected a subset of features with high predictive power for gene essentiality through a carefully designed feature selection system. Using the selected features as input, we constructed an ensemble classifier and trained the model on a well-studied microorganism. After fine-tuning the model parameters in cross-validation, we tested the model on a different microorganism. We found that tenfold cross-validation within the same organism achieves high predictive accuracy (AUC ~0.9), and that cross-organism predictions between distantly related organisms yield AUC scores from 0.69 to 0.89, significantly outperforming homology mapping.
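The evaluation described in the abstract (ensemble scoring, tenfold cross-validation, AUC) can be illustrated with a minimal sketch. The abstract does not name the base classifiers, the way their outputs are combined, or how folds are drawn, so everything below is an assumption made for illustration: score averaging stands in for the ensemble, the folds are simple contiguous splits, and the function names `auc`, `ensemble_scores`, and `kfold_indices` and the toy scores are hypothetical.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen essential gene (label 1)
    receives a higher score than a randomly chosen non-essential gene
    (label 0); ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ensemble_scores(per_model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine base classifiers by averaging their scores gene by gene.

    Averaging is an assumed combination rule, not one stated in the abstract.
    """
    n_models = len(per_model_scores)
    return [sum(column) / n_models for column in zip(*per_model_scores)]


def kfold_indices(n_items: int, k: int = 10) -> List[List[int]]:
    """Split indices 0..n_items-1 into k contiguous, nearly equal test folds."""
    folds = []
    start = 0
    for i in range(k):
        size = n_items // k + (1 if i < n_items % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


if __name__ == "__main__":
    # Toy data: 1 = essential, 0 = non-essential (made up for illustration).
    labels = [1, 1, 0, 0, 1, 0]
    model_a = [0.9, 0.4, 0.3, 0.5, 0.8, 0.1]
    model_b = [0.7, 0.6, 0.4, 0.2, 0.9, 0.3]
    combined = ensemble_scores([model_a, model_b])
    print("AUC, model A alone:", auc(labels, model_a))
    print("AUC, averaged ensemble:", auc(labels, combined))
    print("Fold sizes for 25 genes:", [len(f) for f in kfold_indices(25)])
```

In a cross-organism setting, the same `auc` function would be applied to scores produced for the genes of the test organism by a model trained on a different organism; real gene lists would normally be shuffled or stratified before being split into folds.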


