Clark Nathan L, Hittinger Chris Todd, Li-Byarlay Hongmei, Rokas Antonis, Sackton Timothy B, Unckless Robert L
Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA.
Laboratory of Genetics, DOE Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53706, USA.
Integr Comp Biol. 2025 Jul 23;65(1):63-73. doi: 10.1093/icb/icaf037.
A major goal of research in evolution and genetics is linking genotype to phenotype. This work could be direct, such as determining the genetic basis of a phenotype by leveraging genetic variation or divergence in a developmental, physiological, or behavioral trait. The work could also involve studying the evolutionary phenomena (e.g., reproductive isolation, adaptation, sexual dimorphism, behavior) that reveal an indirect link between genotype and a trait of interest. When the phenotype diverges across evolutionarily distinct lineages, this genotype-to-phenotype problem can be addressed using phylogenetic genotype-to-phenotype (PhyloG2P) mapping, which uses genetic signatures and convergent phenotypes on a phylogeny to infer the genetic bases of traits. The PhyloG2P approach has proven powerful in revealing key genetic changes associated with diverse traits, including the mammalian transition to marine environments and transitions between major mechanisms of photosynthesis. However, there are several intermediate traits layered in between genotype and the phenotype of interest, including but not limited to transcriptional profiles, chromatin states, protein abundances, structures, modifications, metabolites, and physiological parameters. Each intermediate trait is interesting and informative in its own right, but synthesis across data types has great promise for providing a deep, integrated, and predictive understanding of how genotypes drive phenotypic differences and convergence. We argue that an expanded PhyloG2P framework (the PhyloG2P matrix) that explicitly considers intermediate traits, and imputes those that are prohibitive to obtain, will allow a better mechanistic understanding of any trait of interest. This approach provides a proxy for functional validation and mechanistic understanding in organisms where laboratory manipulation is impractical.