MoCHI: neural networks to fit interpretable models and quantify energies, energetic couplings, epistasis, and allostery from deep mutational scanning data.

Author information

Faure Andre J, Lehner Ben

Affiliations

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.

Current Address: ALLOX, PRBB Building, C/Dr. Aiguader, 88, 08003, Barcelona, Spain.

Publication information

Genome Biol. 2024 Dec 2;25(1):303. doi: 10.1186/s13059-024-03444-y.

DOI:10.1186/s13059-024-03444-y

PMID:39617885

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11610129/

Abstract

We present MoCHI, a tool to fit interpretable models using deep mutational scanning data. MoCHI infers free energy changes, as well as interaction terms (energetic couplings) for specified biophysical models, including from multimodal phenotypic data. When a user-specified model is unavailable, global nonlinearities (epistasis) can be estimated from the data. MoCHI also leverages ensemble, background-averaged epistasis to learn sparse models that can incorporate higher-order epistatic terms. MoCHI is freely available as a Python package (https://github.com/lehner-lab/MoCHI) relying on the PyTorch machine learning framework and allows biophysical measurements at scale, including the construction of allosteric maps of proteins.
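To make the idea of additive free energies passing through a global nonlinearity concrete, the following is a minimal sketch of a generic two-state folding model. It does not use MoCHI's API and is not the authors' implementation. The function names, mutation labels, energy values, temperature, and sign convention (negative ΔG favours the folded state) are all hypothetical choices made for illustration.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)
T = 303.0     # assumed temperature in kelvin (hypothetical choice)


def fraction_folded(delta_g, rt=R * T):
    """Two-state Boltzmann model: fraction of molecules in the folded state.

    Assumed convention: negative delta_g favours the folded state.
    """
    return 1.0 / (1.0 + math.exp(delta_g / rt))


def variant_delta_g(wt_delta_g, ddg_by_mutation, mutations, couplings=None):
    """Wild-type energy plus additive per-mutation terms and optional pairwise couplings."""
    total = wt_delta_g + sum(ddg_by_mutation[m] for m in mutations)
    if couplings:
        present = set(mutations)
        for pair, value in couplings.items():
            # A coupling term applies only when both mutations are present.
            if set(pair) <= present:
                total += value
    return total


# Hypothetical energies in kcal/mol, invented for illustration only.
wt_dg = -2.0
ddg = {"A10G": 1.0, "L25P": 1.5}
coupling = {("A10G", "L25P"): -0.5}

single = fraction_folded(variant_delta_g(wt_dg, ddg, ["A10G"]))
double_additive = fraction_folded(variant_delta_g(wt_dg, ddg, ["A10G", "L25P"]))
double_coupled = fraction_folded(
    variant_delta_g(wt_dg, ddg, ["A10G", "L25P"], coupling)
)

print(f"single mutant:              {single:.3f}")
print(f"double mutant, additive:    {double_additive:.3f}")
print(f"double mutant, with coupling: {double_coupled:.3f}")
```

In this sketch the energies add linearly, yet the fraction folded does not, because the sigmoid saturates. That is one way a global nonlinearity can appear as epistasis at the phenotype level, while the explicit coupling term stands in for a specific energetic interaction between two mutations.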

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5992/11610129/94cfad2e139f/13059_2024_3444_Fig1_HTML.jpg
