School of Mathematics and Statistics, Xidian University, Xi'an, 710071, PR China.
School of Electronic Engineering, Xidian University, Xi'an, 710071, PR China.
Genomics. 2019 Dec;111(6):1760-1770. doi: 10.1016/j.ygeno.2018.11.031. Epub 2018 Dec 6.
Meiotic recombination plays an important role in genetic evolution. Previous research has shown that recombination rates provide important information about the mechanism of recombination. However, most current methods ignore the hidden correlation and spatial autocorrelation of DNA sequences. In this study, we propose a predictor called iRSpot-DTS to identify recombination hot/cold spots on benchmark datasets. We propose a feature extraction method called dinucleotide-based spatial autocorrelation (DSA), which incorporates both the original DNA properties and the spatial information of the DNA sequence. The t-SNE method is then used to remove noise, and it outperformed PCA for this purpose. Finally, an SAE softmax classifier, which is network-based and can capture more of the hidden information in DNA sequences, performs the classification. Jackknife cross-validation tests were carried out on two benchmark datasets. iRSpot-DTS achieved state-of-the-art results, with 96.61% overall accuracy (OA), a Matthews correlation coefficient (MCC) of 93.16%, and sensitivity (Sn) and specificity (Sp) both above 95%, the best results reported to date.
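The evaluation protocol named in the abstract (jackknife, i.e. leave-one-out, cross-validation scored by Sn, Sp, OA and MCC) can be illustrated with a minimal sketch. The function names `jackknife_predictions`, `binary_metrics`, `fit_centroids` and `predict_centroid` are hypothetical, and the one-dimensional nearest-centroid classifier is only a toy stand-in, not the DSA, t-SNE or SAE softmax steps of iRSpot-DTS. The metric formulas are the standard confusion-matrix definitions and are returned as fractions rather than percentages.

```python
import math


def jackknife_predictions(features, labels, fit, predict):
    """Leave-one-out: train on every sample except one, predict the held-out one."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, features[i]))
    return predictions


def binary_metrics(y_true, y_pred):
    """Sn, Sp, OA and MCC for labels coded 1 (hot spot) and 0 (cold spot)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Assumption: report 0.0 when a denominator is empty instead of raising.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Sn": ratio(tp, tp + fn),            # sensitivity
        "Sp": ratio(tn, tn + fp),            # specificity
        "OA": ratio(tp + tn, len(pairs)),    # overall accuracy
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy stand-in classifier (hypothetical): nearest class mean on a 1-D feature.
def fit_centroids(xs, ys):
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def predict_centroid(centroids, x):
    return min(centroids, key=lambda label: abs(x - centroids[label]))


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the benchmark datasets.
    toy_x = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
    toy_y = [0, 0, 0, 1, 1, 1]
    preds = jackknife_predictions(toy_x, toy_y, fit_centroids, predict_centroid)
    print(binary_metrics(toy_y, preds))
```

Because the jackknife refits the model once per sample, its cost grows linearly with the dataset size times the training cost, which is worth keeping in mind when the classifier is a neural network.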