[Identification of nucleosome positioning using support vector machine method based on comprehensive DNA sequence feature].

Author information

Cui Ying, Xu Zelong, Li Jianzhong

Affiliations

Electronic Engineering College, Heilongjiang University, Harbin 150080, P.R. China; School of Bioinformatics Sciences and Technology, Harbin Medical University, Harbin 150081, P.R. China.

School of Bioinformatics Sciences and Technology, Harbin Medical University, Harbin 150081, P.R. China.

Publication information

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2020 Jun 25;37(3):496-501. doi: 10.7507/1001-5515.201911064.

Abstract

In this article, a model of nucleosome sequences was constructed based on z-curve theory and the position weight matrix (PWM). The nucleosome sequence dataset was transformed into three-dimensional coordinates, and the PWM of the nucleosome sequences was calculated to obtain a similarity score. Integrating the two yielded a nucleosome feature model based on comprehensive DNA sequence features, named CSeqFM. We calculated the Euclidean distance between each nucleosome sequence candidate or linker sequence and the CSeqFM model as the feature dataset, and used the feature datasets to train and test a support vector machine (SVM) with ten-fold cross-validation. The sensitivity, specificity, accuracy and Matthews correlation coefficient (MCC) of nucleosome positioning identification were 97.1%, 96.9%, 94.2% and 0.89, respectively, and the area under the receiver operating characteristic curve (AUC) was 0.9801. Compared with another z-curve method, our method identified nucleosome positioning more effectively and scored higher on each evaluation metric. The CSeqFM method was also applied to identify nucleosome positioning in three other species. The AUCs for the three species were all higher than 0.90, and CSeqFM was more stable and effective than the iNuc-STNC and iNuc-PseKNC methods, which further demonstrates that the CSeqFM method is reliable and identifies nucleosome positioning well.
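
The abstract does not spell out how CSeqFM is built, so the following Python sketch only illustrates the ingredients it names: the standard z-curve transform of a DNA sequence into three-dimensional coordinates, a log-odds PWM similarity score, and the Euclidean distance from a candidate sequence to a reference model. The function names, the pseudocount, the uniform background, the four-value feature layout and the use of a mean vector as the "model" are all assumptions made for illustration, not the authors' implementation.

```python
import math

BASES = "ACGT"

def z_curve(seq):
    """Cumulative z-curve points (x_n, y_n, z_n) for n = 1..len(seq)."""
    counts = dict.fromkeys(BASES, 0)
    points = []
    for base in seq.upper():
        counts[base] += 1
        a, c, g, t = (counts[b] for b in BASES)
        points.append((
            (a + g) - (c + t),  # purine vs. pyrimidine
            (a + c) - (g + t),  # amino vs. keto
            (a + t) - (c + g),  # weak vs. strong hydrogen bonding
        ))
    return points

def build_pwm(seqs, pseudocount=1.0):
    """Log-odds PWM from equal-length sequences (assumed 0.25 background)."""
    total = len(seqs) + 4 * pseudocount
    pwm = []
    for i in range(len(seqs[0])):
        column = [s[i].upper() for s in seqs]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / 0.25)
            for b in BASES
        })
    return pwm

def pwm_score(pwm, seq):
    """Sum of per-position log-odds: the similarity of seq to the PWM."""
    return sum(col[b] for col, b in zip(pwm, seq.upper()))

def features(seq, pwm):
    """Assumed feature layout: final z-curve point plus the PWM score."""
    x, y, z = z_curve(seq)[-1]
    return [x, y, z, pwm_score(pwm, seq)]

def model_vector(seqs, pwm):
    """Assumed reference model: mean feature vector of known nucleosomes."""
    rows = [features(s, pwm) for s in seqs]
    return [sum(col) / len(rows) for col in zip(*rows)]

def distance_to_model(seq, pwm, model):
    """Euclidean distance between a candidate's features and the model."""
    return math.dist(features(seq, pwm), model)

# Toy data for illustration only; these sequences are not from the paper.
nucleosomes = ["ACGTAC", "ACGTTC", "AGGTAC"]
pwm = build_pwm(nucleosomes)
model = model_vector(nucleosomes, pwm)
print(distance_to_model("ACGTAC", pwm, model))
```

In a full pipeline, distances like these would be computed for both nucleosome candidates and linker sequences and passed to an SVM evaluated with ten-fold cross-validation, for example with scikit-learn's `SVC` and `cross_val_score(..., cv=10)`. That step is omitted here to keep the sketch dependency-free.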
