Walton Thomas, Tsui Darin, Fogel Lauren, Huard Dustin J E, Chagas Rafael Siqueira, Lieberman Raquel L, Aghazadeh Amirali
School of Electrical and Computer Engineering, Georgia Institute of Technology.
School of Chemistry and Biochemistry, Georgia Institute of Technology.
bioRxiv. 2025 Jun 24:2025.06.17.660210. doi: 10.1101/2025.06.17.660210.
Missense mutations in the MYOC gene, particularly those affecting the olfactomedin (OLF) domain of the myocilin protein, can be causal for open-angle glaucoma, a leading cause of irreversible blindness. However, predicting the pathogenicity of these mutations remains challenging due to the complex effects of toxic gain-of-function variants and the scarcity of labeled clinical data. Herein, we present GOLF, a generative AI framework for assessing and explaining the pathogenicity of OLF domain variants. GOLF collects and curates a comprehensive dataset of OLF homologs and trains generative models that predict the effect of monoallelic missense mutations. While these models exhibit diverse predictive behaviors, they collectively achieve accurate classification of known pathogenic and benign variants. To interpret their decision mechanisms, GOLF uses a sparse autoencoder (SAE) that reveals the underlying biochemical features exploited by the generative models to predict variant effects. GOLF enables accurate evaluation of disease-causing mutations, supporting early genetic risk stratification for glaucoma and facilitating interpretable investigations into the molecular basis of pathogenic variants.
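The abstract does not say which generative models GOLF trains or how they score variants. As a rough illustration only, the sketch below uses the simplest possible stand-in: a site-independent model fitted to a homolog alignment, with a missense variant scored by the log-likelihood ratio of the mutant versus the wild-type residue. The function names, the pseudocount, and the toy alignment are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def site_profiles(alignment, pseudocount=1.0):
    """Estimate per-column amino-acid probabilities from aligned homologs.

    Assumption: all sequences have equal length; non-standard characters
    (e.g. gaps) are ignored when counting.
    """
    length = len(alignment[0])
    profiles = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profiles.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profiles

def variant_score(profiles, position, wild_type, mutant):
    """Log-likelihood ratio log p(mutant) - log p(wild_type) at a 0-based position.

    More negative values mean the substitution is less compatible with
    the homolog family under this toy model.
    """
    p = profiles[position]
    return math.log(p[mutant]) - math.log(p[wild_type])

# Hypothetical toy alignment, not real OLF domain sequences.
toy_alignment = ["ACDG", "ACDG", "ACEG", "ACDG", "TCDG"]
profiles = site_profiles(toy_alignment)

# Substitution at a fully conserved column vs. one already seen among homologs.
conserved = variant_score(profiles, 1, "C", "W")
tolerated = variant_score(profiles, 2, "D", "E")
```

Under this toy model, the substitution at the conserved column receives a lower score than the one observed among homologs. A threshold on such scores, or a combination of scores from several models, could then separate likely pathogenic from likely benign variants; the models and the combination rule actually used by GOLF are described in the paper itself, not here.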