iDHS-DT: Identifying DNase I hypersensitive sites by integrating DNA dinucleotide and trinucleotide information.

Authors

Zou Hongliang, Yang Fan, Yin Zhijian

Affiliation

School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330003, China.

Publication information

Biophys Chem. 2022 Feb;281:106717. doi: 10.1016/j.bpc.2021.106717. Epub 2021 Nov 14.

Abstract

DNase I hypersensitive sites (DHSs) are important for locating gene regulatory elements such as promoters, enhancers, and silencers, so discriminating DHSs from non-DHSs is crucial. Although traditional methods such as Southern blotting and DNase-seq can identify DHSs, they are time-consuming, laborious, and expensive. To address these issues, researchers have turned their attention to computational approaches. In this study, we developed a novel predictor called iDHS-DT to identify DHSs. In this predictor, DNA sequences are first represented by the physicochemical properties (PC) of DNA dinucleotides and trinucleotides. Then, three descriptors (auto-covariance, cross-covariance, and discrete wavelet transform) are used to extract features from the PC matrix. Next, the least absolute shrinkage and selection operator (LASSO) algorithm is employed to remove irrelevant and redundant features. Finally, the selected features are fed into a support vector machine (SVM) to distinguish DHSs from non-DHSs. The proposed method achieved 97.64% and 98.22% classification accuracy on the two datasets, respectively. Compared with existing predictors, the proposed model shows a significant improvement in classification performance. The experimental results demonstrate that the proposed method is powerful in identifying DHSs.
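
The auto-covariance and cross-covariance descriptors mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a commonly used definition in which deviations from the mean property value are multiplied at a given lag and averaged. The two property tables (`prop_a`, `prop_b`) hold made-up placeholder values, not real physicochemical data, and all function names are hypothetical. Only dinucleotides are encoded; the trinucleotide encoding, the discrete wavelet transform, LASSO selection, and SVM classification are not shown.

```python
from itertools import product

# All 16 dinucleotides in lexicographic order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]

# ASSUMPTION: placeholder numbers standing in for real physicochemical
# property tables, which the abstract does not list.
TOY_PROPERTIES = {
    "prop_a": {d: float(i) for i, d in enumerate(DINUCLEOTIDES)},
    "prop_b": {d: float((i * 7) % 16) for i, d in enumerate(DINUCLEOTIDES)},
}


def encode(seq, table):
    """Return one property value per overlapping dinucleotide of seq."""
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def cross_covariance(x, y, lag):
    """Average of (x[i] - mean(x)) * (y[i + lag] - mean(y)) over valid i."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(x)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    total = sum((x[i] - mean_x) * (y[i + lag] - mean_y) for i in range(n - lag))
    return total / (n - lag)


def auto_covariance(x, lag):
    """Auto-covariance is cross-covariance of a property track with itself."""
    return cross_covariance(x, x, lag)


def acc_features(seq, properties, max_lag):
    """Concatenate auto- and cross-covariance values for lags 1..max_lag."""
    tracks = {name: encode(seq, table) for name, table in properties.items()}
    features = []
    for lag in range(1, max_lag + 1):
        # One auto-covariance value per property.
        for a in tracks:
            features.append(auto_covariance(tracks[a], lag))
        # One cross-covariance value per ordered pair of distinct properties.
        for a in tracks:
            for b in tracks:
                if a != b:
                    features.append(cross_covariance(tracks[a], tracks[b], lag))
    return features


if __name__ == "__main__":
    vector = acc_features("ACGTACGTAC", TOY_PROPERTIES, max_lag=3)
    print(len(vector), "features")
```

With N properties and a maximum lag of K, this sketch yields K × (N + N × (N − 1)) values regardless of sequence length, which is why covariance-style descriptors are convenient for turning variable-length sequences into fixed-length vectors suitable for a classifier.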
