School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China.
School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China.
Biomed Res Int. 2016;2016:8527435. doi: 10.1155/2016/8527435. Epub 2016 Aug 25.
Recombination is not uniformly distributed across the genome. Genomic regions with relatively high recombination frequencies are called hotspots, while those with relatively low recombination frequencies are called coldspots. Identifying hotspots and coldspots can therefore provide useful information for studying the mechanism of recombination. In this study, a new computational predictor called SVM-EL was proposed to identify hotspots and coldspots across the yeast genome. It combines Support Vector Machines (SVMs) with Ensemble Learning (EL), using three types of features: basic kmer (Kmer), dinucleotide-based auto-cross covariance (DACC), and pseudo dinucleotide composition (PseDNC). These features incorporate both the nucleic acid composition and the sequence-order information into the predictor. SVM-EL achieved an accuracy of 82.89% on a widely used benchmark dataset, outperforming some related methods.
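As an illustration of the simplest of the three feature types, the basic kmer feature can be sketched as the normalized frequency of every length-k word in a DNA sequence. This is a minimal sketch, not the authors' implementation: the function name `kmer_composition`, the default `k=2`, the lexicographic ordering of the output, and the choice to skip windows containing ambiguous bases are all assumptions made for the example.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies of a DNA sequence.

    Hypothetical helper for illustration only. The output has 4**k entries,
    ordered lexicographically over the alphabet A, C, G, T.
    """
    alphabet = "ACGT"
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    seq = seq.upper()
    total = 0
    # Slide a window of width k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if word in counts:
            counts[word] += 1
            total += 1

    if total == 0:
        # Sequence shorter than k, or no valid window at all.
        return [0.0] * len(kmers)
    return [counts[w] / total for w in kmers]


if __name__ == "__main__":
    features = kmer_composition("ACGTACGTTTGA", k=2)
    print(len(features), round(sum(features), 6))
```

A vector like this could be passed to any SVM implementation as one feature view. DACC and PseDNC additionally encode sequence-order information and are not shown here.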