pDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine.

Authors

Zhang Shanxin, Zhou Zhiping, Chen Xinmeng, Hu Yong, Yang Lindong

Affiliations

Engineering Research Center of IoT Technology Applications (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China.

Publication information

J Theor Biol. 2017 Aug 7;426:126-133. doi: 10.1016/j.jtbi.2017.05.030. Epub 2017 May 26.

Abstract

DNase I hypersensitive sites (DHSs) are accessible chromatin regions that are hypersensitive to cleavage by the DNase I endonuclease. DHSs are indicative of cis-regulatory DNA elements (CREs), all of which play important roles in global gene expression regulation, so recognizing DHSs in a genome helps to discover CREs. To accelerate this investigation, developing cost-effective computational methods to identify DHSs is an important complement. However, tools for identifying DHSs in plant genomes are lacking. Here we present pDHS-SVM, a computational predictor for identifying plant DHSs. To integrate global sequence-order information with local DNA properties, the feature space was constructed from reverse complement k-mers and dinucleotide-based auto covariance of DNA sequences. Fifteen physical-chemical properties of dinucleotides were used, and a Support Vector Machine (SVM) was employed. To further improve the predictor's performance and to extract an optimized subset of nucleotide physical-chemical properties that contribute positively to DHS prediction, a heuristic nucleotide physical-chemical property selection algorithm was introduced. With the optimized subset of properties, experiments on Arabidopsis thaliana and rice (Oryza sativa) showed that pDHS-SVM achieved accuracies of up to 87.00% and 85.79%, respectively. These results indicate the effectiveness of the proposed method for predicting DHSs. Furthermore, pDHS-SVM could provide a helpful complement for predicting CREs in plant genomes. The implementation of pDHS-SVM is freely available as source code at https://github.com/shanxinzhang/pDHS-SVM.
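The two feature types named in the abstract can be illustrated with a minimal sketch. It is not taken from the pDHS-SVM repository: the function names, the default k and lag values, and the TOY_GC table are assumptions made for illustration. In particular, TOY_GC (the number of G or C bases in a dinucleotide) is a placeholder and not one of the fifteen physical-chemical properties used in the paper. The auto covariance follows the commonly used definition, in which each property value is centred on its mean over the sequence and multiplied with the value a fixed lag further along.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical_kmers(k):
    """Sorted k-mers, keeping one representative per reverse-complement pair."""
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted({min(kmer, reverse_complement(kmer)) for kmer in kmers})


def revc_kmer_features(seq, k=2):
    """Normalised k-mer frequencies with reverse complements merged."""
    seq = seq.upper()
    counts = dict.fromkeys(canonical_kmers(k), 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
            counts[min(kmer, reverse_complement(kmer))] += 1
            total += 1
    return [counts[key] / total if total else 0.0 for key in sorted(counts)]


def dac_features(seq, property_tables, max_lag=2):
    """Dinucleotide-based auto covariance for each property and each lag.

    Assumes seq contains only A, C, G, T and is longer than max_lag + 1.
    """
    seq = seq.upper()
    if len(seq) <= max_lag + 1:
        raise ValueError("sequence too short for the requested lag")
    features = []
    for table in property_tables:
        values = [table[seq[i:i + 2]] for i in range(len(seq) - 1)]
        mean = sum(values) / len(values)
        for lag in range(1, max_lag + 1):
            n = len(values) - lag
            cov = sum((values[i] - mean) * (values[i + lag] - mean)
                      for i in range(n)) / n
            features.append(cov)
    return features


# Placeholder property (hypothetical): G/C count of each dinucleotide.
TOY_GC = {a + b: float((a in "GC") + (b in "GC"))
          for a, b in product("ACGT", repeat=2)}


def build_feature_vector(seq, k=2, max_lag=2):
    """Concatenate both feature groups into one vector."""
    return revc_kmer_features(seq, k) + dac_features(seq, [TOY_GC], max_lag)
```

A vector built this way could then be passed to any SVM implementation. With real property tables substituted for TOY_GC, the vector length would be the number of canonical k-mers plus the number of properties times max_lag.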
