Inferring a complete genotype-phenotype map from a small number of measured phenotypes.

Author information

Institute for Molecular Biology, University of Oregon, Eugene, OR, United States of America.

Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR, United States of America.

Publication information

PLoS Comput Biol. 2020 Sep 29;16(9):e1008243. doi: 10.1371/journal.pcbi.1008243. eCollection 2020 Sep.

Abstract

Understanding evolution requires detailed knowledge of genotype-phenotype maps; however, it can be a herculean task to measure every phenotype in a combinatorial map. We have developed a computational strategy to predict the missing phenotypes from an incomplete, combinatorial genotype-phenotype map. As a test case, we used an incomplete genotype-phenotype dataset previously generated for the malaria parasite's 'chloroquine resistance transporter' (PfCRT). Wild-type PfCRT (PfCRT3D7) lacks significant chloroquine (CQ) transport activity, but the introduction of the eight mutations present in the 'Dd2' isoform of PfCRT (PfCRTDd2) enables the protein to transport CQ away from its site of antimalarial action. This gain of a transport function imparts CQ resistance to the parasite. A combinatorial map between PfCRT3D7 and PfCRTDd2 consists of 256 genotypes, of which only 52 have had their CQ transport activities measured through expression in the Xenopus laevis oocyte. We trained a statistical model with these 52 measurements to infer the CQ transport activity for the remaining 204 combinatorial genotypes between PfCRT3D7 and PfCRTDd2. Our best-performing model incorporated a binary classifier, a nonlinear scale, and additive effects for each mutation. The addition of specific pairwise- and high-order-epistatic coefficients decreased the predictive power of the model. We evaluated our predictions by experimentally measuring the CQ transport activities of 24 additional PfCRT genotypes. The R² value between our predicted and newly-measured phenotypes was 0.90. We then used the model to probe the accessibility of evolutionary trajectories through the map. Approximately 1% of the possible trajectories between PfCRT3D7 and PfCRTDd2 are accessible; however, none of the trajectories entailed eight successive increases in CQ transport activity. These results demonstrate that phenotypes can be inferred with known uncertainty from a partial genotype-phenotype dataset. We also validated our approach against a collection of previously published genotype-phenotype maps. The model therefore appears general and should be applicable to a large number of genotype-phenotype maps.
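
The counts in the abstract follow from the combinatorics of eight binary sites: 2^8 = 256 genotypes and 8! = 40320 orderings in which the mutations can accumulate. The Python sketch below illustrates that bookkeeping only. It is not the authors' model or code: `toy_phenotype`, its weights and its threshold are hypothetical placeholders rather than measured data. The accessibility rule used here (no step lowers the phenotype, with neutral steps allowed) is also an assumption made for illustration.

```python
from itertools import permutations, product
from math import factorial

N_SITES = 8  # eight mutations separate PfCRT3D7 from PfCRTDd2

# Every combination of the eight mutations, encoded as 0/1 tuples.
genotypes = list(product((0, 1), repeat=N_SITES))


def toy_phenotype(genotype):
    """Hypothetical phenotype: additive effects with a floor at zero.

    The weights and threshold are made-up placeholders, not fitted values.
    """
    weights = (1.0, 0.5, -0.2, 0.8, 0.1, -0.4, 0.6, 0.3)
    threshold = 1.0
    additive = sum(w * g for w, g in zip(weights, genotype))
    return max(0.0, additive - threshold)


def is_accessible(order, phenotype):
    """Return True if adding mutations in `order` never lowers the phenotype.

    Assumption for illustration: neutral steps (no change) are allowed.
    """
    genotype = [0] * N_SITES
    current = phenotype(tuple(genotype))
    for site in order:
        genotype[site] = 1
        following = phenotype(tuple(genotype))
        if following < current:
            return False
        current = following
    return True


n_trajectories = factorial(N_SITES)
n_accessible = sum(
    is_accessible(order, toy_phenotype)
    for order in permutations(range(N_SITES))
)

print(len(genotypes), "genotypes")
print(n_trajectories, "possible trajectories")
print(n_accessible, "accessible under the toy phenotype")
```

Swapping `toy_phenotype` for a function that returns a measured or predicted CQ transport activity for each genotype would apply the same enumeration to real data. The resulting fraction depends on the accessibility criterion chosen.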

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6633/7546491/815450a339d7/pcbi.1008243.g001.jpg
