Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou, 310018 Zhejiang, China.
College of Electrical and Information Engineering, Hunan University, Changsha, 410082 Hunan, China.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae517.
Aptamers are single-stranded nucleic acid ligands with high affinity and specificity for their target molecules. Traditionally, they are identified from large DNA/RNA libraries using in vitro methods such as Systematic Evolution of Ligands by Exponential Enrichment (SELEX). However, these libraries capture only a small fraction of the theoretical sequence space, and the candidates that can be identified are limited by the sequencing capacity of the experiment. To address this, we propose AptaDiff, the first in silico aptamer design and optimization method based on a diffusion model. AptaDiff can generate aptamers beyond the constraints of high-throughput sequencing data by leveraging motif-dependent latent embeddings from a variational autoencoder, and can optimize aptamers through affinity-guided generation driven by Bayesian optimization. Comparative evaluations on four high-throughput screening datasets targeting distinct proteins showed that AptaDiff outperforms existing aptamer generation methods in quality and fidelity. Moreover, surface plasmon resonance experiments were conducted to validate the binding affinity of aptamers generated through Bayesian optimization for two target proteins. The results showed significant increases of 87.9% and 60.2% in response unit (RU) values, along with 3.6-fold and 2.4-fold decreases in dissociation constant (KD) values, for the respective target proteins. Notably, the optimized aptamers showed higher binding affinity than the top experimental candidates selected through SELEX, underscoring the potential of AptaDiff to accelerate the discovery of superior aptamers.
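To give a sense of scale for the claim that libraries capture only a small fraction of the theoretical sequence space, the arithmetic can be sketched as follows. This is a minimal illustration, not part of AptaDiff: the 40-nucleotide random region and the library size of 10^15 molecules are assumed values chosen only for the example, and the function names are hypothetical.

```python
def sequence_space_size(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of a given length over A, C, G, T/U."""
    return alphabet_size ** length


def max_library_coverage(length: int, library_size: int) -> float:
    """Upper bound on the fraction of sequence space a library can contain.

    Assumes every molecule in the library is a unique sequence, which is
    the most optimistic case; real libraries contain duplicates.
    """
    space = sequence_space_size(length)
    return min(1.0, library_size / space)


if __name__ == "__main__":
    # Assumed values for illustration only (not taken from the paper).
    random_region_length = 40
    library_size = 10**15

    space = sequence_space_size(random_region_length)
    coverage = max_library_coverage(random_region_length, library_size)
    print(f"Theoretical sequences: {space:.2e}")
    print(f"Best-case coverage:    {coverage:.2e}")
```

Under these assumed values, even the best case covers less than one part in a billion of the possible sequences, and sequencing reads only a subset of the library on top of that.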