Graduate Program in Structural Biology, Biochemistry and Biophysics, Syracuse University, Syracuse, New York, United States of America.
PLoS One. 2011;6(5):e19395. doi: 10.1371/journal.pone.0019395. Epub 2011 May 19.
BACKGROUND: Aptamers are oligonucleotides that bind proteins and other targets with high affinity and selectivity. Twenty years ago, elements of natural selection were adapted to in vitro selection to distinguish aptamers within randomized sequence libraries. The primary bottleneck in traditional aptamer discovery is the need for multiple cycles of in vitro evolution.
METHODOLOGY/PRINCIPAL FINDINGS: We show that over-representation of sequences in aptamer libraries, combined with deep sequencing, enables acyclic identification of aptamers. We demonstrated this by isolating a known family of aptamers for human α-thrombin. Aptamers were found within a library containing an average of 56,000 copies of each possible randomized 15mer segment. The high-affinity sequences were counted many times above the background in 2-6 million reads. Clustering analysis of sequences with more than 10 counts distinguished two sequence motifs with candidates at high abundance. Motif I contained the previously observed consensus 15mer, Thb1 (46,000 counts), and related variants with mostly G/T substitutions; secondary analysis showed that affinity for thrombin correlated with abundance (K(d) = 12 nM for Thb1). The signal-to-noise ratio for this experiment was roughly 10,000∶1 for Thb1. Motif II was unrelated to Thb1; its leading candidate (29,000 counts) was a novel aptamer against hexose sugars in the storage and elution buffers for Concanavalin A (K(d) = 0.5 µM for α-methyl-mannoside). ConA was used to immobilize α-thrombin.
CONCLUSIONS/SIGNIFICANCE: Over-representation together with deep sequencing can dramatically shorten the discovery process, distinguish aptamers having a wide range of affinity for the target, allow an exhaustive search of the sequence space within a simplified library, reduce the quantity of the target required, eliminate cycling artifacts, and should allow multiplexing of sequencing experiments and targets.
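The library arithmetic and the read-counting step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy reads, and the assumption that each read has already been trimmed to its 15mer segment are all hypothetical. Only the segment length (15), the average copy number (56,000), and the threshold of more than 10 counts come from the abstract.

```python
from collections import Counter

SEGMENT_LENGTH = 15           # randomized 15mer segment (from the abstract)
COPIES_PER_SEQUENCE = 56_000  # average copies of each sequence (from the abstract)
MIN_COUNT = 10                # sequences with more than 10 counts were clustered
AVOGADRO = 6.02214076e23      # molecules per mole


def sequence_space(length=SEGMENT_LENGTH):
    """Number of distinct DNA sequences of a given length (4 bases per position)."""
    return 4 ** length


def library_picomoles(length=SEGMENT_LENGTH, copies=COPIES_PER_SEQUENCE):
    """Amount of library needed to hold `copies` of every sequence, in picomoles."""
    molecules = sequence_space(length) * copies
    return molecules / AVOGADRO * 1e12


def abundant_segments(segments, min_count=MIN_COUNT):
    """Tally segments and keep those seen more than `min_count` times.

    Hypothetical helper: assumes each item is an already-trimmed segment.
    Results are ordered from most to least abundant.
    """
    counts = Counter(s.upper() for s in segments)
    return [(seq, n) for seq, n in counts.most_common() if n > min_count]


if __name__ == "__main__":
    print(sequence_space())            # 4**15 distinct 15mers
    print(round(library_picomoles()))  # on the order of 100 pmol
    # Toy reads, invented purely for illustration.
    toy_reads = ["ACGTACGTACGTACG"] * 12 + ["TTTTTTTTTTTTTTT"] * 3
    print(abundant_segments(toy_reads))
```

Under these assumptions, the 15mer space holds 4^15 (about 1.07 billion) sequences, so 56,000 copies of each corresponds to roughly 100 pmol of library, and a simple tally with a count threshold is enough to separate over-represented candidates from background before any clustering.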