HELP: A computational framework for labelling and predicting human common and context-specific essential genes.

Affiliations

Institute for High-Performance Computing and Networking, National Research Council, Naples, Italy.

Information Technology Services, University of Naples "L'Orientale", Naples, Italy.

Publication information

PLoS Comput Biol. 2024 Sep 27;20(9):e1012076. doi: 10.1371/journal.pcbi.1012076. eCollection 2024 Sep.

DOI:10.1371/journal.pcbi.1012076

PMID:39331694

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11463781/

Abstract

Machine learning-based approaches are particularly suitable for identifying essential genes, as they allow predictive models to be trained on features from multi-source data. Gene essentiality is neither binary nor static but is determined by context. Databases of essential gene annotations do not allow the context to be personalised, and their updates can lag behind the publication of new experimental data. We propose HELP (Human Gene Essentiality Labelling & Prediction), a computational framework for labelling and predicting essential genes. Its dual scope allows essential genes to be identified either with or without reliance on experimental data. The effectiveness of the labelling method was demonstrated by comparing its overlap with reference sets of essential gene annotations against that of other approaches; HELP achieved the best compromise between false and true positive rates. The gene attributes, including multi-omics and network embedding features, lead to high-performance prediction of essential genes while confirming the existence of nuances in essentiality.
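The comparison against reference sets mentioned in the abstract can be illustrated with a minimal sketch. It is not the HELP implementation: the function name `overlap_rates`, the toy gene identifiers, and the choice to treat every gene outside the reference set as a negative are all assumptions made here for illustration.

```python
def overlap_rates(predicted, reference, universe):
    """Return (TPR, FPR) of a predicted essential-gene set against a reference set.

    Assumption for this sketch: genes in `universe` that are absent from
    `reference` count as non-essential (negatives).
    """
    universe = set(universe)
    predicted = set(predicted) & universe   # ignore genes outside the universe
    reference = set(reference) & universe
    negatives = universe - reference

    true_pos = len(predicted & reference)
    false_pos = len(predicted & negatives)

    tpr = true_pos / len(reference) if reference else 0.0
    fpr = false_pos / len(negatives) if negatives else 0.0
    return tpr, fpr


# Toy identifiers, purely illustrative (not real gene symbols or real labels).
universe = {f"G{i}" for i in range(1, 11)}   # 10 genes in total
reference = {"G1", "G2", "G3", "G4"}         # hypothetical reference essential set
predicted = {"G1", "G2", "G3", "G7"}         # hypothetical labelling output

tpr, fpr = overlap_rates(predicted, reference, universe)
print(f"TPR={tpr:.3f} FPR={fpr:.3f}")
```

A labelling method with a good compromise in this sense recovers most reference genes (high TPR) while labelling few genes outside the reference set as essential (low FPR).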
