Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD 20892, USA.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad485.
DNA methylation profiling is a useful tool for improving the accuracy of cancer diagnosis. However, a comprehensive R package dedicated to this task has been lacking. We therefore developed the R package methylClass for methylation-based classification. It provides the eSVM (ensemble-based support vector machine) model, which achieves much higher accuracy in methylation data classification than the popular random forest model and overcomes the long running time of the traditional SVM. The package also includes several novel feature selection methods to improve classification. Furthermore, because methylation data can be converted to other omics data, such as copy number variation data, we also provide functions for multi-omics studies. Tests on four datasets show that the package performs accurately, especially eSVM, which can be used in both methylation and multi-omics models and outperforms other methods in both cases. methylClass is available at: https://github.com/yuabrahamliu/methylClass.