Predicting gene knockout effects from expression data.

Affiliations

School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.

Department of Genetics, The Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.

Publication information

BMC Med Genomics. 2023 Feb 18;16(1):26. doi: 10.1186/s12920-023-01446-6.

DOI: 10.1186/s12920-023-01446-6
PMID: 36803845
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9938619/
Abstract

BACKGROUND

The study of gene essentiality, which measures the importance of a gene for cell division and survival, is used for the identification of cancer drug targets and understanding of tissue-specific manifestation of genetic conditions. In this work, we analyze essentiality and gene expression data from over 900 cancer lines from the DepMap project to create predictive models of gene essentiality.

METHODS

We developed machine learning algorithms to identify those genes whose essentiality levels are explained by the expression of a small set of "modifier genes". To identify these gene sets, we developed an ensemble of statistical tests capturing linear and non-linear dependencies. We trained several regression models predicting the essentiality of each target gene, and used an automated model selection procedure to identify the optimal model and hyperparameters. Overall, we examined linear models, gradient boosted trees, Gaussian process regression models, and deep learning networks.
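The modifier-gene screening step can be illustrated with a minimal sketch. It is not the authors' ensemble of tests: it assumes just two stand-in statistics, Pearson correlation for linear dependence and Spearman rank correlation for monotonic non-linear dependence, and scores each candidate gene by the larger absolute value. The function names (`select_modifiers`, `pearson`, `spearman`, `ranks`), the gene labels, and the synthetic data are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_modifiers(expression, essentiality, k):
    """Return the k genes with the highest combined dependency score.

    expression: dict mapping gene name -> expression values, one per cell line.
    essentiality: essentiality scores of the target gene, in the same order.
    """
    scores = {
        gene: max(abs(pearson(values, essentiality)),
                  abs(spearman(values, essentiality)))
        for gene, values in expression.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Synthetic data (assumption): 20 irrelevant genes plus two true modifiers,
# one with a linear effect and one with a non-linear monotonic effect.
rng = random.Random(0)
n_lines = 200
expression = {f"NOISE_{i}": [rng.random() for _ in range(n_lines)] for i in range(20)}
expression["MOD_A"] = [rng.random() for _ in range(n_lines)]
expression["MOD_B"] = [rng.random() for _ in range(n_lines)]
essentiality = [
    2.0 * a + math.exp(3.0 * b) / 10.0 + rng.gauss(0.0, 0.1)
    for a, b in zip(expression["MOD_A"], expression["MOD_B"])
]

modifiers, scores = select_modifiers(expression, essentiality, k=2)
print(sorted(modifiers))
```

Taking the maximum of several statistics is only one way to combine tests. A real pipeline would also need multiple-testing control and a test for non-monotonic dependence, neither of which is shown here.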

RESULTS

We identified nearly 3000 genes whose essentiality we accurately predict from the expression of a small set of modifier genes. Our model outperforms current state-of-the-art approaches both in the number of genes with successful predictions and in prediction accuracy.

CONCLUSIONS

Our modeling framework avoids overfitting by identifying the small set of modifier genes, which are of clinical and genetic importance, and ignores the expression of noisy and irrelevant genes. Doing so improves the accuracy of essentiality prediction in various conditions and provides interpretable models. Overall, we present an accurate computational approach, as well as interpretable modeling of essentiality in a wide range of cellular conditions, thus contributing to a better understanding of the molecular mechanisms that govern tissue-specific effects of genetic disease and cancer.
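The overfitting argument can be illustrated with a toy comparison of cross-validated error when a model sees every gene versus only the modifier genes. This sketch does not reproduce the paper's results. It assumes synthetic data and uses a k-nearest-neighbour regressor purely as a simple stand-in, since it is not one of the model families listed in the Methods. The names `knn_predict` and `cv_mse` are hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training rows closest to the query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return sum(train_y[i] for i in nearest) / k


def cv_mse(features, target, k_neighbors=5, n_folds=5):
    """Mean squared error of k-NN regression under n-fold cross-validation."""
    n = len(target)
    squared_errors = []
    for fold in range(n_folds):
        # Assign every n_folds-th row to the held-out fold.
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        train_x = [features[i] for i in train_idx]
        train_y = [target[i] for i in train_idx]
        for i in test_idx:
            prediction = knn_predict(train_x, train_y, features[i], k_neighbors)
            squared_errors.append((prediction - target[i]) ** 2)
    return sum(squared_errors) / n


# Synthetic data (assumption): columns 0 and 1 are the true modifier genes,
# the remaining 20 columns are irrelevant "noise" genes.
rng = random.Random(0)
n_lines, n_noise = 200, 20
rows = [[rng.random() for _ in range(2 + n_noise)] for _ in range(n_lines)]
essentiality = [
    2.0 * r[0] + math.exp(3.0 * r[1]) / 10.0 + rng.gauss(0.0, 0.1)
    for r in rows
]

mse_all = cv_mse(rows, essentiality)
mse_modifiers = cv_mse([r[:2] for r in rows], essentiality)
print(f"all genes: {mse_all:.3f}  modifiers only: {mse_modifiers:.3f}")
```

In this toy setting the irrelevant columns dominate the distance computation, so the model restricted to the two modifier columns generalizes better on held-out cell lines.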


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e12e/9938619/231a76e396f2/12920_2023_1446_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e12e/9938619/fe23f3639d41/12920_2023_1446_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e12e/9938619/ef9e68795334/12920_2023_1446_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e12e/9938619/9ebc68715962/12920_2023_1446_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e12e/9938619/f33dc659b6ce/12920_2023_1446_Fig5_HTML.jpg
