Breimann Stephan, Kamp Frits, Basset Gabriele, Abou-Ajram Claudia, Güner Gökhan, Yanagida Kanta, Okochi Masayasu, Müller Stephan A, Lichtenthaler Stefan F, Langosch Dieter, Frishman Dmitrij, Steiner Harald
Biomedical Center (BMC), Division of Metabolic Biochemistry, Faculty of Medicine, LMU Munich, München, Germany.
German Center for Neurodegenerative Diseases (DZNE), DZNE Munich, München, Germany.
Nat Commun. 2025 Jul 1;16(1):5428. doi: 10.1038/s41467-025-60638-z.
Proteases recognize substrates by decoding sequence information, an essential cellular process that remains elusive when recognition motifs are absent. Here, we unravel this problem for γ-secretase, an intramembrane-cleaving protease associated with Alzheimer's disease and cancer, by developing Comparative Physicochemical Profiling (CPP), a sequence-based algorithm for identifying interpretable physicochemical features. We show that CPP deciphers a γ-secretase substrate signature with single-residue resolution, which can explain the conformational transitions observed in substrates upon γ-secretase binding. Using machine learning, we predict the entire human γ-secretase substrate scope, revealing numerous previously unknown substrates. Our approach outperforms state-of-the-art protein language models, improving prediction accuracy from 60% to 90%, and achieves an 88% success rate in experimental validation. Building on these advancements, we identify pathways and diseases not previously linked to γ-secretase. More generally, CPP decodes physicochemical signatures, a concept that extends beyond sequence motifs. We anticipate that our approach will be broadly applicable to diverse molecular recognition processes.