Predicting and explaining the impact of genetic disruptions and interactions on organismal viability.

Affiliations

Food and Nutrition Program, Kuwait Institute for Scientific Research, Safat 13109, Kuwait.

Systems and Software Development Department, Kuwait Institute for Scientific Research, Safat 13109, Kuwait.

Publication information

Bioinformatics. 2022 Sep 2;38(17):4088-4099. doi: 10.1093/bioinformatics/btac519.

DOI: 10.1093/bioinformatics/btac519
PMID: 35861390
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9438956/
Abstract

MOTIVATION

Existing computational models can predict single- and double-mutant fitness, but they have limitations. First, they are often evaluated with metrics that are inappropriate for imbalanced datasets. Second, all of them predict only a binary outcome (viable or not, and negatively interacting or not). Third, most are uninterpretable black-box machine learning models.
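To illustrate the first limitation, the sketch below compares plain accuracy with balanced accuracy (the mean of per-class recall) on a toy imbalanced label set. The labels, class names and the choice of balanced accuracy are assumptions made for illustration only; the abstract does not state which metrics the authors or earlier studies used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 90 viable mutants, 10 lethal ones.
y_true = ["viable"] * 90 + ["lethal"] * 10
# A trivial model that always predicts the majority class.
y_pred = ["viable"] * 100

acc = accuracy(y_true, y_pred)
bacc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f} balanced_accuracy={bacc:.2f}")
```

The majority-class predictor looks strong under plain accuracy even though it never identifies a lethal mutant, whereas balanced accuracy exposes it as no better than chance on this two-class toy set.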

RESULTS

Budding yeast datasets were used to develop high-performance Multinomial Regression (MN) models capable of predicting the impact of single, double and triple genetic disruptions on viability. These models are interpretable, give realistic non-binary predictions and can predict negative genetic interactions (GIs) in triple-gene knockouts. They are based on a limited set of gene features, and their predictions are influenced by the probability of the target gene participating in molecular complexes or pathways. Furthermore, the MN models have utility in other organisms such as fission yeast, fruit flies and humans, with the single-gene fitness MN model being able to distinguish essential genes necessary for cell-autonomous viability from those required for multicellular survival. Finally, our models exceed the performance of previous models without sacrificing interpretability.
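As a rough illustration of why a multinomial regression model yields non-binary and interpretable predictions, the sketch below scores a gene with one linear function per class and converts the scores to probabilities with a softmax. The class names, feature names, weights and biases are all hypothetical values chosen for the example; they are not taken from the paper or its repository.

```python
import math

# Hypothetical outcome classes and gene features (not from the paper).
CLASSES = ["lethal", "reduced fitness", "normal"]
FEATURES = ["in_complex", "network_degree"]

# One weight row and one bias per class; made-up numbers for illustration.
WEIGHTS = [
    [2.0, 1.0],  # lethal
    [0.5, 0.5],  # reduced fitness
    [0.0, 0.0],  # normal (reference class)
]
BIAS = [-2.0, -1.0, 0.0]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(x):
    """Multinomial regression: linear score per class, then softmax."""
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(WEIGHTS, BIAS)]
    return softmax(scores)


probs = predict_proba([1.0, 1.0])
for cls, p in zip(CLASSES, probs):
    print(f"{cls}: {p:.3f}")
```

Because each weight ties one feature to one class, the contribution of a feature to a prediction can be read directly from the weight table, which is the sense in which such a model is interpretable.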

AVAILABILITY AND IMPLEMENTATION

All code and processed datasets used to generate the results and figures in this manuscript are available in our GitHub repository at https://github.com/KISRDevelopment/cell_viability_paper. The repository also contains a link to the GI prediction website, which lets users search for GIs using the MN models.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d33e/9438956/5a35dd95af26/btac519f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d33e/9438956/0adf2d2b2582/btac519f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d33e/9438956/17ca71832e3f/btac519f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d33e/9438956/bd8924daaee5/btac519f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d33e/9438956/023b8f6608f9/btac519f5.jpg
