An extension of the Walsh-Hadamard transform to calculate and model epistasis in genetic landscapes of arbitrary shape and complexity.

Affiliations

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain.

Universitat Pompeu Fabra (UPF), Barcelona, Spain.

Publication information

PLoS Comput Biol. 2024 May 28;20(5):e1012132. doi: 10.1371/journal.pcbi.1012132. eCollection 2024 May.

Abstract

Accurate models describing the relationship between genotype and phenotype are necessary to understand and predict how mutations to biological sequences affect the fitness and evolution of living organisms. The apparent abundance of epistasis (genetic interactions), both between and within genes, complicates this task, and how to build mechanistic models that incorporate epistatic coefficients (genetic interaction terms) remains an open question. The Walsh-Hadamard transform provides a rigorous computational framework for calculating and modeling epistatic interactions at the level of individual genotypic values (known as genetical, biological or physiological epistasis), and can therefore be used to address fundamental questions related to sequence-to-function encodings. However, one of its main limitations is that it can accommodate only two alleles (amino acid or nucleotide states) per sequence position. In this paper we provide an extension of the Walsh-Hadamard transform that allows the calculation and modeling of background-averaged epistasis (also known as ensemble epistasis) in genetic landscapes with an arbitrary number of states per position (20 for amino acids, 4 for nucleotides, etc.). We also provide a recursive formula for the inverse matrix and then derive formulae to directly extract any element of either matrix without having to rely on the computationally intensive task of constructing or inverting large matrices. Finally, we demonstrate the utility of our theory by using it to model epistasis within both simulated and empirical multiallelic fitness landscapes, revealing that both pairwise and higher-order genetic interactions are enriched between physically interacting positions.
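To make the starting point concrete, the following is a minimal sketch of the standard two-allele Walsh-Hadamard transform that the abstract describes as the limited case. It is not the paper's multiallelic extension. It shows the Sylvester recursion for the matrix, the well-known closed form for a single element (the sign depends on the parity of the bits shared by the row and column indices), and the inverse obtained by dividing by the matrix size. All function names are hypothetical, genotypes are assumed to be indexed by their binary encoding, and the weighting and sign adjustment needed to turn these raw coefficients into background-averaged epistatic terms is deliberately omitted.

```python
from typing import List


def hadamard_element(i: int, j: int) -> int:
    """Entry (i, j) of the Sylvester-ordered matrix, computed without building it.

    The sign is negative when i and j share an odd number of set bits.
    """
    return -1 if bin(i & j).count("1") % 2 else 1


def hadamard(n: int) -> List[List[int]]:
    """Build the 2^n x 2^n matrix recursively: H_n = [[H, H], [H, -H]]."""
    h = [[1]]
    for _ in range(n):
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h


def wht(y: List[float]) -> List[float]:
    """Unnormalized transform H_n @ y, using only element-wise lookups."""
    size = len(y)
    return [
        sum(hadamard_element(i, j) * y[j] for j in range(size))
        for i in range(size)
    ]


def inverse_wht(c: List[float]) -> List[float]:
    """Inverse transform: H_n @ H_n = 2^n * I, so divide by the length."""
    size = len(c)
    return [v / size for v in wht(c)]


# Illustrative (made-up) phenotypes for genotypes 00, 01, 10, 11.
phenotypes = [0, 1, 1, 3]
coefficients = wht(phenotypes)
print(coefficients)               # [5, -3, -3, 1]
print(inverse_wht(coefficients))  # [0.0, 1.0, 1.0, 3.0]
```

In this toy landscape the last coefficient equals y00 - y01 - y10 + y11 = 1, the familiar pairwise interaction term. The element-wise lookup mirrors, for the two-allele case only, the idea of extracting matrix entries without constructing or inverting large matrices.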

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/80dc/11161127/074f9906e87b/pcbi.1012132.g001.jpg
