Yu Xinliang, Yu Yixiong, Zeng Qun
College of Chemistry and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, Hunan, China; State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha, Hunan, China.
G1302, Lushan International Experimental School, Changsha, Hunan, China.
PLoS One. 2014 Jun 13;9(6):e99964. doi: 10.1371/journal.pone.0099964. eCollection 2014.
Aptamers with high affinity and specificity have been extensively synthesized and characterized for analytical and biomedical applications. However, few publications describe structure-activity relationships (SARs) of candidate aptamer sequences.
This paper reports pattern recognition with support vector machine (SVM) classification to identify streptavidin-binding aptamers as "low" or "high" affinity aptamers. The SVM parameters C and γ were optimized using genetic algorithms. Four descriptors were used to describe the structures of candidate aptamer sequences from SELEX selection (Schütze et al. (2011) PLoS ONE (12):e29604): the topological descriptor PW4 (path/walk 4 - Randic shape index), the connectivity index X3A (average connectivity index chi-3), the topological charge index JGI2 (mean topological charge index of order 2), and the free energy E of the secondary structure.
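The search over C and γ described above can be sketched with a small real-coded genetic algorithm. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the search bounds on log2 C and log2 γ, the population settings, and the operators (tournament selection, blend crossover, Gaussian mutation, elitism) are all hypothetical choices. The fitness function here is a placeholder surface so that the example runs with the standard library alone. In practice it would be replaced by the cross-validated accuracy of an SVM trained on the four descriptors (PW4, X3A, JGI2, E).

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, mutation_scale=0.5, seed=0):
    """Maximise fitness(log2_C, log2_gamma) with a simple real-coded GA.

    All settings are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    def clip(value, lo, hi):
        return max(lo, min(hi, value))

    def tournament(scored, k=3):
        # Return the fittest of k randomly chosen individuals.
        return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best_score, best = float("-inf"), None

    for _ in range(generations):
        scored = [(fitness(*ind), ind) for ind in population]
        gen_score, gen_best = max(scored, key=lambda pair: pair[0])
        if gen_score > best_score:
            best_score, best = gen_score, list(gen_best)

        children = [list(best)]  # elitism: carry the best individual forward
        while len(children) < pop_size:
            a, b = tournament(scored), tournament(scored)
            w = rng.random()  # blend crossover weight
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            # Gaussian mutation, clipped back into the search bounds.
            child = [
                clip(g + rng.gauss(0, mutation_scale), lo, hi)
                if rng.random() < mutation_rate else g
                for g, (lo, hi) in zip(child, bounds)
            ]
            children.append(child)
        population = children

    return best, best_score


def placeholder_fitness(log2_c, log2_gamma):
    # Stand-in for cross-validated SVM accuracy; the peak location is arbitrary.
    # With scikit-learn one could instead return, for example,
    # cross_val_score(SVC(C=2**log2_c, gamma=2**log2_gamma), X, y, cv=5).mean()
    return 1.0 - 0.01 * ((log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2)


# Hypothetical search ranges for log2(C) and log2(gamma).
bounds = [(-5.0, 15.0), (-15.0, 3.0)]
best, score = genetic_search(placeholder_fitness, bounds)
C, gamma = 2.0 ** best[0], 2.0 ** best[1]
print(f"C = {C:.3g}, gamma = {gamma:.3g}, fitness = {score:.4f}")
```

Searching in log2 space is a common convention for SVM hyperparameters because useful values of C and γ span several orders of magnitude. The fixed seed makes the run reproducible.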
The predicted fractions of winning streptavidin-binding aptamers over ten rounds of SELEX are consistent with the aptamer evolutionary principles of SELEX-based screening. These results demonstrate the feasibility of applying pattern recognition based on SVM and genetic algorithms to streptavidin-binding aptamers.