Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec H3A 0G4, Canada.
Montreal Neurological Institute-Hospital, McGill University, Montreal, Quebec H3A 2B4, Canada.
Brain. 2023 Nov 2;146(11):4608-4621. doi: 10.1093/brain/awad224.
In recent years, a growing number of genes have been associated with amyotrophic lateral sclerosis (ALS), resulting in an increasing number of novel variants, particularly missense variants, many of which are of unknown clinical significance. Here, we leverage the sequencing efforts of the ALS Knowledge Portal (3864 individuals with ALS and 7839 controls) and Project MinE ALS Sequencing Consortium (4366 individuals with ALS and 1832 controls) to perform proteomic and transcriptomic characterization of missense variants in 24 ALS-associated genes. The two sequencing datasets were interrogated for missense variants in the 24 genes, and variants were annotated with gnomAD minor allele frequencies, ClinVar pathogenicity classifications, protein sequence features, including UniProt functional site annotations and PhosphoSitePlus post-translational modification site annotations, structural features from AlphaFold predicted monomeric 3D structures, and transcriptomic expression levels from the Genotype-Tissue Expression (GTEx) project. We then binned variation by the selected proteomic and transcriptomic features and applied missense variant enrichment and gene-burden testing to identify the features most relevant to pathogenicity in ALS-associated genes. Using predicted human protein structures from AlphaFold, we determined that missense variants carried by individuals with ALS were significantly enriched in β-sheets and α-helices, as well as in core, buried or moderately buried regions. At the same time, we identified that hydrophobic amino acid residues, compositionally biased protein regions and regions of interest are predominantly enriched in missense variants carried by individuals with ALS. Assessment of expression level based on transcriptomics also revealed enrichment of variants with high and medium expression across all tissues and within the brain.
We further explored enriched features of interest using burden analyses and found that individual genes were indeed driving certain enrichment signals. A case study is presented for SOD1 to demonstrate, as a proof of concept, how enriched features may aid in defining variant pathogenicity. Our results present proteomic and transcriptomic features that are important indicators of missense variant pathogenicity in ALS and are distinct from features associated with neurodevelopmental disorders.
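The feature-binned enrichment testing described above can be illustrated with a small sketch. The abstract does not state which statistical test was used, so this example assumes a one-sided Fisher's exact test on a 2x2 table of variant counts (cases vs. controls, inside vs. outside a feature bin such as "buried residue"). The function names and all counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in cases.

    Assumed 2x2 layout (hypothetical, not from the paper):
                   in feature bin   outside bin
        cases            a               b
        controls         c               d

    Returns P(X >= a) under the hypergeometric null, where X is the
    number of in-bin variants that fall in cases.
    """
    cases_total = a + b
    in_bin_total = a + c
    n = a + b + c + d
    upper = min(cases_total, in_bin_total)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    numerator = sum(
        comb(cases_total, k) * comb(n - cases_total, in_bin_total - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n, in_bin_total)


def odds_ratio(a, b, c, d):
    """Sample odds ratio; adds 0.5 to every cell if any cell is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Hypothetical counts for one feature bin (illustration only).
    a, b, c, d = 40, 160, 25, 275
    print("odds ratio:", odds_ratio(a, b, c, d))
    print("one-sided p-value:", fisher_exact_greater(a, b, c, d))
```

In practice one such test would be run per feature bin, followed by a multiple-testing correction; a gene-burden analysis would then repeat the comparison within each gene to see which genes drive a given signal.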