Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, 301 Beard Hall, University of North Carolina, Chapel Hill, NC 27599, United States.
Glycan Therapeutics, 617 Hutton Street, Raleigh, NC 27606, United States.
Glycobiology. 2024 May 26;34(7). doi: 10.1093/glycob/cwae039.
Heparan sulfate (HS), a sulfated polysaccharide abundant in the extracellular matrix, plays pivotal roles in various physiological and pathological processes by interacting with proteins. Investigating the binding selectivity of HS oligosaccharides to target proteins is essential, but the exhaustive inclusion of all possible oligosaccharides in microarray experiments is impractical. To address this challenge, we present a hybrid pipeline that integrates microarray and in silico techniques to design oligosaccharides with desired protein affinity. Using fibroblast growth factor 2 (FGF2) as a model protein, we assembled an in-house dataset of HS oligosaccharides on microarrays and developed two structural representations: a standard representation with all atoms explicit and a simplified representation with disaccharide units as "quasi-atoms." Predictive Quantitative Structure-Activity Relationship (QSAR) models for FGF2 affinity were developed using the Random Forest (RF) algorithm. The resulting models, considering the applicability domain, demonstrated high predictivity, with a correct classification rate of 0.81-0.80 and improved positive predictive values (PPV) up to 0.95. Virtual screening of 40 new oligosaccharides using the simplified model identified 15 computational hits, 11 of which were experimentally validated for high FGF2 affinity. This hybrid approach marks a significant step toward the targeted design of oligosaccharides with desired protein interactions, providing a foundation for broader applications in glycobiology.
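The two ideas in the abstract that lend themselves to a small illustration are the simplified "quasi-atom" representation and the reported metrics. The Python sketch below is not the authors' implementation. It assumes that an oligosaccharide can be written as an ordered list of disaccharide labels (the labels used here are placeholders), that descriptors are simple counts of units and adjacent unit pairs, and that the correct classification rate (CCR) is the mean of sensitivity and specificity, as is common in QSAR work. For the screening result, it further assumes that the 4 computational hits not confirmed experimentally count as false positives.

```python
from collections import Counter


def quasi_atom_descriptors(units):
    """Count disaccharide units and adjacent unit pairs in one oligosaccharide.

    `units` is an ordered list of disaccharide labels, each treated as a
    single "quasi-atom". This descriptor scheme is an illustrative
    assumption, not the scheme used in the paper.
    """
    counts = Counter(units)
    counts.update(f"{a}>{b}" for a, b in zip(units, units[1:]))
    return dict(counts)


def ccr(tp, fp, tn, fn):
    """Correct classification rate, taken here as (sensitivity + specificity) / 2."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def ppv(tp, fp):
    """Positive predictive value: fraction of predicted binders that are true binders."""
    return tp / (tp + fp)


# Hypothetical tetradecasaccharide-like sequence written with placeholder labels.
example = ["D1", "D2", "D2", "D1"]
descriptors = quasi_atom_descriptors(example)

# Screening outcome from the abstract: 15 computational hits, 11 confirmed.
# Treating the 4 unconfirmed hits as false positives is an assumption.
screening_ppv = ppv(tp=11, fp=4)

if __name__ == "__main__":
    print(descriptors)
    print(f"Experimental hit rate among computational hits: {screening_ppv:.2f}")
```

Under these assumptions, the experimental confirmation rate among the 15 computational hits is 11/15, or about 0.73. That figure describes the prospective screen and is distinct from the cross-validated PPV of up to 0.95 reported for the models.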