School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, PR China.
School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066004, PR China.
Genomics. 2019 May;111(3):457-464. doi: 10.1016/j.ygeno.2018.03.003. Epub 2018 Mar 13.
Recombination spot identification plays an important role in revealing genome evolution and in advancing the study of DNA function. Although several computational methods have been proposed, extracting the discriminatory information embedded in DNA properties has not received enough attention. These properties include dinucleotide flexibility, structural parameters and thermodynamic parameters, all of which are significant for genome evolution research. To explore the potential effect of DNA properties, a novel feature extraction method, called iRSpot-PDI, is proposed. A wrapper feature selection method with best-first search is used to identify the best feature set. To verify the effectiveness of the proposed method, a support vector machine is applied to the selected features. Prediction results are reported on two benchmark datasets. Compared with recently reported methods, iRSpot-PDI achieves the highest specificity, Matthews correlation coefficient and overall accuracy. The experimental results confirm that iRSpot-PDI is effective for accurate identification of recombination spots. The datasets can be downloaded from the following URL: http://stxy.neuq.edu.cn/info/1095/1157.htm.
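The evaluation measures named in the abstract (specificity, Matthews correlation coefficient and overall accuracy) follow from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions and is not taken from the authors' code. The function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Compute sensitivity, specificity, accuracy and MCC from confusion counts."""
    # Sensitivity: fraction of true recombination hotspots predicted as hotspots.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true coldspots predicted as coldspots.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # Overall accuracy: fraction of all samples predicted correctly.
    acc = (tp + tn) / (tp + tn + fp + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}

# Hypothetical counts for illustration only (not results from the paper).
m = binary_metrics(tp=40, tn=45, fp=5, fn=10)
print(m)
```

Unlike accuracy, the MCC takes all four confusion-matrix cells into account, so it stays informative when hotspots and coldspots are unevenly represented in a benchmark set.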