Minimum epistasis interpolation for sequence-function relationships.

Affiliation

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.

Publication information

Nat Commun. 2020 Apr 14;11(1):1782. doi: 10.1038/s41467-020-15512-5.

DOI:10.1038/s41467-020-15512-5

PMID:32286265

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7156698/

Abstract

Massively parallel phenotyping assays have provided unprecedented insight into how multiple mutations combine to determine biological function. While such assays can measure phenotypes for thousands to millions of genotypes in a single experiment, in practice these measurements are not exhaustive, so that there is a need for techniques to impute values for genotypes whose phenotypes have not been directly assayed. Here, we present an imputation method based on inferring the least epistatic possible sequence-function relationship compatible with the data. In particular, we infer the reconstruction where mutational effects change as little as possible across adjacent genetic backgrounds. The resulting models can capture complex higher-order genetic interactions near the data, but approach additivity where data is sparse or absent. We apply the method to high-throughput transcription factor binding assays and use it to explore a fitness landscape for protein G.
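The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes two alleles per site, measures epistasis as the double-mutant contrast f(00) - f(10) - f(01) + f(11) on every pair of sites in every genetic background, and fills in unmeasured genotypes by least squares so that the sum of squared contrasts is as small as possible while measured values stay fixed. The function names `face_matrix` and `min_epistasis_interpolation` are hypothetical.

```python
import itertools
import numpy as np

def face_matrix(n_sites):
    """Return all binary genotypes and one epistasis contrast row per
    (pair of sites, background on the remaining sites)."""
    genotypes = list(itertools.product((0, 1), repeat=n_sites))
    index = {g: i for i, g in enumerate(genotypes)}
    rows = []
    for i, j in itertools.combinations(range(n_sites), 2):
        others = [k for k in range(n_sites) if k not in (i, j)]
        for background in itertools.product((0, 1), repeat=len(others)):
            g = [0] * n_sites
            for k, allele in zip(others, background):
                g[k] = allele
            row = np.zeros(len(genotypes))
            # Contrast f(00) - f(10) - f(01) + f(11) at sites i and j.
            for a, b, sign in ((0, 0, 1), (1, 0, -1), (0, 1, -1), (1, 1, 1)):
                g[i], g[j] = a, b
                row[index[tuple(g)]] = sign
            rows.append(row)
    return genotypes, np.array(rows)

def min_epistasis_interpolation(n_sites, observed):
    """Impute unobserved phenotypes by minimising total squared epistasis.

    observed: dict mapping genotype tuples (e.g. (0, 1, 0)) to phenotypes.
    Observed values are kept exactly; only the unknowns are solved for.
    """
    genotypes, E = face_matrix(n_sites)
    known = np.array([g in observed for g in genotypes])
    f = np.zeros(len(genotypes))
    f[known] = [observed[g] for g in genotypes if g in observed]
    # Minimise ||E_unknown @ x + E_known @ f_known||^2 over the unknowns x.
    x, *_ = np.linalg.lstsq(E[:, ~known], -E[:, known] @ f[known], rcond=None)
    f[~known] = x
    return dict(zip(genotypes, f))
```

If only the wild type and the single mutants are supplied, zero epistasis is attainable, so the imputed values coincide with the additive prediction. This matches the abstract's statement that the reconstruction approaches additivity where data are sparse. Enumerating every face explicitly is only practical for a handful of sites.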

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a595/7156698/dd5e11e9b2a8/41467_2020_15512_Fig1_HTML.jpg

Similar articles

A systematic survey of an intragenic epistatic landscape.

Mol Biol Evol. 2015 Jan;32(1):229-38. doi: 10.1093/molbev/msu301. Epub 2014 Nov 3.

Inferring fitness landscapes by regression produces biased estimates of epistasis.

Proc Natl Acad Sci U S A. 2014 Jun 3;111(22):E2301-9. doi: 10.1073/pnas.1400849111. Epub 2014 May 19.

Inferring genetic interactions from comparative fitness data.

Elife. 2017 Dec 20;6:e28629. doi: 10.7554/eLife.28629.

Learning epistatic gene interactions from perturbation screens.

PLoS One. 2021 Jul 13;16(7):e0254491. doi: 10.1371/journal.pone.0254491. eCollection 2021.

Genotype networks of 80 quantitative Arabidopsis thaliana phenotypes reveal phenotypic evolvability despite pervasive epistasis.

PLoS Comput Biol. 2020 Aug 13;16(8):e1008082. doi: 10.1371/journal.pcbi.1008082. eCollection 2020 Aug.

High-order epistasis shapes evolutionary trajectories.

PLoS Comput Biol. 2017 May 15;13(5):e1005541. doi: 10.1371/journal.pcbi.1005541. eCollection 2017 May.

An extension of the Walsh-Hadamard transform to calculate and model epistasis in genetic landscapes of arbitrary shape and complexity.

PLoS Comput Biol. 2024 May 28;20(5):e1012132. doi: 10.1371/journal.pcbi.1012132. eCollection 2024 May.

Analysis of epistatic interactions and fitness landscapes using a new geometric approach.

BMC Evol Biol. 2007 Apr 13;7:60. doi: 10.1186/1471-2148-7-60.

Higher-order epistasis and phenotypic prediction.

Proc Natl Acad Sci U S A. 2022 Sep 27;119(39):e2204233119. doi: 10.1073/pnas.2204233119. Epub 2022 Sep 21.

Cited by

Learning sequence-function relationships with scalable, interpretable Gaussian processes.

bioRxiv. 2025 Aug 19:2025.08.15.670613. doi: 10.1101/2025.08.15.670613.

Risk prediction models for biochemical recurrence of Chinese prostate cancer patients after radical prostatectomy based on magnetic resonance imaging examination: a systematic review.

Quant Imaging Med Surg. 2025 Sep 1;15(9):8648-8662. doi: 10.21037/qims-2024-2843. Epub 2025 Aug 13.

Investigating the determinants of performance in machine learning for protein fitness prediction.

Protein Sci. 2025 Aug;34(8):e70235. doi: 10.1002/pro.70235.

On learning functions over biological sequence space: relating Gaussian process priors, regularization, and gauge fixing.

bioRxiv. 2025 Jul 11:2025.04.26.650699. doi: 10.1101/2025.04.26.650699.

ArXiv. 2025 Jul 11:arXiv:2504.19034v2.

Inference and visualization of complex genotype-phenotype maps with .

bioRxiv. 2025 Mar 15:2025.03.09.642267. doi: 10.1101/2025.03.09.642267.

Gauge fixing for sequence-function relationships.

PLoS Comput Biol. 2025 Mar 20;21(3):e1012818. doi: 10.1371/journal.pcbi.1012818. eCollection 2025.

Scaling the fitness effects of mutations with respect to differentially adapted Arabidopsis thaliana accessions under natural conditions.

Evolution. 2025 Jun 14;79(6):951-961. doi: 10.1093/evolut/qpaf029.

Importance of higher-order epistasis in large protein sequence-function relationships.

bioRxiv. 2024 Sep 24:2024.09.22.614318. doi: 10.1101/2024.09.22.614318.

The genetic architecture of protein stability.

Nature. 2024 Oct;634(8035):995-1003. doi: 10.1038/s41586-024-07966-0. Epub 2024 Sep 25.

References

Learning the pattern of epistasis linking genotype and phenotype in a protein.

Nat Commun. 2019 Sep 16;10(1):4213. doi: 10.1038/s41467-019-12130-8.

Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis.

Proc Natl Acad Sci U S A. 2019 Aug 13;116(33):16367-16377. doi: 10.1073/pnas.1903888116. Epub 2019 Aug 1.

Machine-learning-guided directed evolution for protein engineering.

Nat Methods. 2019 Aug;16(8):687-694. doi: 10.1038/s41592-019-0496-6. Epub 2019 Jul 15.

Massively Parallel Assays and Quantitative Sequence-Function Relationships.

Annu Rev Genomics Hum Genet. 2019 Aug 31;20:99-127. doi: 10.1146/annurev-genom-083118-014845. Epub 2019 May 15.

The Causes and Consequences of Genetic Interactions (Epistasis).

Annu Rev Genomics Hum Genet. 2019 Aug 31;20:433-460. doi: 10.1146/annurev-genom-083118-014857. Epub 2019 May 13.

An experimental assay of the interactions of amino acids from orthologous sequences shaping a complex fitness landscape.

PLoS Genet. 2019 Apr 10;15(4):e1008079. doi: 10.1371/journal.pgen.1008079. eCollection 2019 Apr.

Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function.

Mol Biol Evol. 2018 Oct 1;35(10):2345-2354. doi: 10.1093/molbev/msy141.

Inferring the shape of global epistasis.

Proc Natl Acad Sci U S A. 2018 Aug 7;115(32):E7550-E7558. doi: 10.1073/pnas.1804015115. Epub 2018 Jul 23.

Evolutionary constraints in fitness landscapes.

Heredity (Edinb). 2018 Nov;121(5):466-481. doi: 10.1038/s41437-018-0110-1. Epub 2018 Jul 11.

The Influence of Higher-Order Epistasis on Biological Fitness Landscape Topography.

J Stat Phys. 2018;172(1):208-225. doi: 10.1007/s10955-018-1975-3. Epub 2018 Feb 7.
