INSERM, UMR1090 TAGC, Marseille, F-13288 France; Aix-Marseille Université, UMR1090 TAGC, Marseille, F-13288 France.
Cancer Research UK, London Research Institute, London WC2A 3LY, UK.
Bioinformatics. 2016 Apr 1;32(7):1091-3. doi: 10.1093/bioinformatics/btv705. Epub 2015 Dec 1.
Supervised classification based on support vector machines (SVMs) has been used successfully to predict cis-regulatory modules (CRMs). However, no integrated tool is currently available that combines heterogeneous data such as position-specific scoring matrices, ChIP-seq data or conservation scores. Here, we present LedPred, a flexible SVM workflow that predicts new regulatory sequences from the annotation of known CRMs, which are associated with a large variety of feature types. LedPred is provided as an R/Bioconductor package connected to an online server, which avoids the need to install non-R software. Owing to its integration of heterogeneous CRM features, LedPred outperforms similar SVM-based software at predicting regulatory sequences in Drosophila and mouse datasets.
LedPred is available on GitHub: https://github.com/aitgon/LedPred and Bioconductor: http://bioconductor.org/packages/release/bioc/html/LedPred.html under the MIT license.
Supplementary data are available at Bioinformatics online.
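As an illustration of what integrating heterogeneous CRM features can mean in practice, the following minimal sketch builds one feature vector for a candidate sequence from three kinds of evidence named in the abstract: a position-specific scoring matrix, ChIP-seq peaks and conservation scores. It is not LedPred's API or its actual feature pipeline. The matrix values, the uniform background frequency and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 4-position motif given as per-position base probabilities.
# The values are illustrative only and do not come from any real matrix.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumption: uniform background base frequency


def best_pwm_score(seq, pwm=PWM, background=BACKGROUND):
    """Return the highest log-odds motif score over all windows of seq.

    Assumes seq contains only the uppercase letters A, C, G and T.
    """
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    best = -math.inf
    for start in range(len(seq) - width + 1):
        score = sum(
            math.log2(pwm[i][seq[start + i]] / background) for i in range(width)
        )
        best = max(best, score)
    return best


def overlap_fraction(region, peaks):
    """Return the fraction of a half-open region (start, end) covered by peaks."""
    start, end = region
    covered = set()
    for p_start, p_end in peaks:
        # range() is empty when the peak does not intersect the region
        covered.update(range(max(start, p_start), min(end, p_end)))
    return len(covered) / (end - start)


def feature_vector(seq, region, peaks, conservation):
    """Combine motif, ChIP-seq and conservation evidence into one vector."""
    return [
        best_pwm_score(seq),                    # sequence-based feature
        overlap_fraction(region, peaks),        # ChIP-seq-based feature
        sum(conservation) / len(conservation),  # mean conservation score
    ]


if __name__ == "__main__":
    vec = feature_vector(
        seq="TTAGCTAA",
        region=(100, 108),
        peaks=[(104, 120)],
        conservation=[0.2, 0.4, 0.6, 0.8],
    )
    print(vec)
```

Vectors of this kind, computed for known CRMs and for background sequences, are the sort of input an SVM classifier would be trained on. Keeping each feature type in its own function makes it straightforward to add or drop a data source.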