Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA.
Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA.
Science. 2023 Apr 28;380(6643):eabm7993. doi: 10.1126/science.abm7993.
Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression, such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.
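The core idea of associating predicted enhancer activity with a phenotype across species can be illustrated with a toy sketch. This is not TACIT's implementation: the function names, the activity scores, and the phenotype values below are all hypothetical, and the plain permutation test used here ignores phylogenetic relatedness, which a real cross-species analysis would need to account for.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant input
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(activity, phenotype, n_perm=2000, seed=0):
    """Two-sided p-value from shuffling phenotype labels across species.

    Assumption: species are treated as exchangeable. That is NOT true for
    related species, so this is only a simplified stand-in for a
    phylogeny-aware test.
    """
    rng = random.Random(seed)
    observed = abs(pearson(activity, phenotype))
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(activity, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical inputs: one candidate enhancer, seven species.
# predicted_activity stands in for a tissue-specific model's output on each
# species' orthologous sequence; phenotype stands in for a continuous trait.
predicted_activity = [0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.90]
phenotype = [1.0, 1.4, 2.1, 2.9, 3.8, 4.4, 5.1]

r = pearson(predicted_activity, phenotype)
p = permutation_p_value(predicted_activity, phenotype)
print(f"correlation = {r:.3f}, permutation p-value = {p:.4f}")
```

In this sketch, each candidate enhancer would be tested separately, so results across many enhancers would also need a multiple-testing correction.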