Division of Mathematics and Science, Walsh University, North Canton, OH 44720, USA.
HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA.
Genes (Basel). 2023 Jan 14;14(1):222. doi: 10.3390/genes14010222.
The SOX transcription factor family is pivotal in controlling aspects of development. To identify genotype-phenotype relationships of SOX proteins, we performed an unbiased study of SOX using 1890 open reading frame and 6667 amino acid sequences in combination with structural dynamics to interpret 3999 gnomAD, 485 ClinVar, 1174 Geno2MP, and 4313 COSMIC human variants. Within the HMG (High Mobility Group)-box, we identified twenty-seven amino acid sites at which changes in multiple SOX proteins are annotated to clinical pathologies. These sites were screened through Geno2MP medical phenotypes, revealing a novel SOX15 R104G variant associated with musculature abnormality and SOX8 R159G associated with intellectual disability. Within gnomAD, SOX18 E137K (rs201931544), located within the HMG-box and found in ~0.8% of Latinx individuals, is associated with seizures and neurological complications, potentially through blood-brain barrier alterations. A total of 56 highly conserved variants were found at sites outside the HMG-box. These include several within the SOX2 HMG-box-flanking region with neurological associations; several in the SOX9 dimerization region associated with Campomelic Dysplasia; SOX14 K88R (rs199932938), flanking the HMG-box and associated with cardiovascular complications within European populations; and SOX7 A379V (rs143587868), within a SOXF-conserved far C-terminal domain, heterozygous in 0.716% of African individuals and associated with eye phenotypes. This SOX data compilation builds a robust genotype-to-phenotype association for a gene family through improved ortholog data integration.