Kappel Kalli, Strebinger Daniel, Edmonds KeHuan K, Chau-Duy-Tam Vo Samuel, Vockley Christopher M, Biswas Tridib, Farhi Samouil L, Macrae Rhiannon, Zhang Feng, Regev Aviv
Howard Hughes Medical Institute, Cambridge, MA, USA.
Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Nat Methods. 2025 Jun 16. doi: 10.1038/s41592-025-02726-y.
Biomolecular condensates organize numerous subcellular processes and have been implicated in diseases, including neurodegeneration and cancer. Protein sequences intrinsically encode their propensity to form condensates, but specific sequence features that regulate this behavior have not been systematically explored at scale. Here, we develop CondenSeq, a high-throughput pooled imaging with in situ sequencing approach to measure propensities of thousands of protein sequences to form nuclear condensates. Leveraging the large scale of these experiments, we evaluated the impacts of dozens of sequence features across a wide range of sequence contexts, identifying several features with highly consistent, context-independent effects and others with less-consistent effects. We also identified multiple classes of condensates and discovered distinct sequence properties that drive their formation. Our results provide a systematic overview of the relationships between protein sequences and nuclear condensate formation and establish a general approach for further dissecting these relationships at scale.