Prediction of methylated CpGs in DNA sequences using a support vector machine.

Authors

Bhasin Manoj, Zhang Hong, Reinherz Ellis L, Reche Pedro A

Affiliation

Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.

Publication

FEBS Lett. 2005 Aug 15;579(20):4302-8. doi: 10.1016/j.febslet.2005.07.002.

Abstract

DNA methylation plays a key role in the regulation of gene expression. The most common type of DNA modification is the methylation of cytosine in the CpG dinucleotide. Currently, no method is available for the prediction of DNA methylation sites. In this study, we therefore developed a support vector machine (SVM)-based method for the prediction of cytosine methylation in CpG dinucleotides. Initially, an SVM module was developed from human data for the prediction of human-specific methylation sites. This module achieved a Matthews correlation coefficient (MCC) of 0.501 and an area under the curve (AUC) of 0.814 when evaluated using 5-fold cross-validation. The SVM-based module performed better than classifiers built using alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics, and decision trees. Additional SVM modules were developed based on mammalian- and vertebrate-specific methylation patterns. The SVM module based on human methylation patterns was used for genome-wide analysis of methylation sites. This analysis showed that the percentage of methylated CpGs is higher in UTRs than in the exonic and intronic regions of human genes. The method is available online for public use under the name Methylator at http://bio.dfci.harvard.edu/Methylator/.
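The two metrics reported in the abstract, MCC and AUC, and the 5-fold evaluation scheme can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names (`mcc`, `auc`, `k_fold_indices`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and no SVM is trained here.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = methylated)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists; fold i tests every k-th item from i."""
    for i in range(k):
        test = [j for j in range(n) if j % k == i]
        train = [j for j in range(n) if j % k != i]
        yield train, test


# Toy data (hypothetical, not from the paper): true labels and classifier scores.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(mcc(labels, preds))   # 0.5
print(auc(labels, scores))  # 0.9375
```

In a cross-validated evaluation, a classifier would be trained on each `train` split from `k_fold_indices`, scored on the matching `test` split, and the metrics computed on the held-out predictions. MCC depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are often reported together.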

