EnhancerPred2.0: predicting enhancers and their strength based on position-specific trinucleotide propensity and electron-ion interaction potential feature selection.

Author information

He Wenying, Jia Cangzhi

Affiliation

Department of Mathematics, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China.

Publication information

Mol Biosyst. 2017 Mar 28;13(4):767-774. doi: 10.1039/c7mb00054e.

Abstract

Enhancers are cis-acting elements that play major roles in upregulating eukaryotic gene expression by providing binding sites for transcription factors and their complexes. Because enhancers are highly cell/tissue specific, lack common motifs, and are located far from their target genes, the systematic and precise identification of enhancer regions in DNA sequences is a major challenge. In this study, we developed an enhancer prediction method called EnhancerPred2.0, which combines position-specific trinucleotide propensity (PSTNP) information with the electron-ion interaction potential (EIIP) values of trinucleotides to predict enhancers and their subgroups. To obtain the optimal combination of features, F-score values were used in a two-step wrapper-based feature selection method, which was applied to the high-dimensional feature vector derived from PSTNP and EIIP. In the end, 126 optimized features from PSTNP combined with 32 optimized features from EIIP yielded the best performance for distinguishing enhancers from non-enhancers, with an overall accuracy (Acc) of 88.27% and a Matthews correlation coefficient (MCC) of 0.77. Additionally, 198 features from PSTNP combined with 37 features from EIIP yielded the best performance for distinguishing strong from weak enhancers, with an overall Acc of 98.05% and an MCC of 0.96. Rigorous jackknife tests indicated that EnhancerPred2.0 was significantly better than the existing enhancer prediction methods in both overall accuracy and stability.
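
The abstract names its ingredients without defining them, so the following Python sketch illustrates three of them under stated assumptions rather than reproducing the authors' implementation. It assumes the widely cited per-nucleotide EIIP values (A 0.1260, C 0.1340, G 0.0806, T 0.1335), a trinucleotide EIIP feature defined as the summed EIIP value weighted by the trinucleotide's frequency, the common two-class F-score definition, and the standard Acc and MCC formulas. The function names are hypothetical. PSTNP is left out because it depends on position-specific statistics from the training sets, which the abstract does not describe.

```python
import math
from itertools import product

# Assumed per-nucleotide EIIP values (commonly cited; not listed in the abstract).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def eiip_trinucleotide_features(seq):
    """64-dim vector: trinucleotide frequency times its summed EIIP value."""
    seq = seq.upper()
    total = len(seq) - 2  # number of overlapping 3-mers
    if total <= 0:
        raise ValueError("sequence must contain at least 3 nucleotides")
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for i in range(total):
        tri = seq[i:i + 3]
        if tri in counts:  # windows with N or other symbols are skipped
            counts[tri] += 1
    return [sum(EIIP[b] for b in tri) * counts[tri] / total
            for tri in TRINUCLEOTIDES]


def f_score(pos, neg):
    """F-score of one feature from its values in the two classes."""
    def mean(xs):
        return sum(xs) / len(xs)

    def sample_var(xs, m):
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    m_all, m_pos, m_neg = mean(pos + neg), mean(pos), mean(neg)
    denom = sample_var(pos, m_pos) + sample_var(neg, m_neg)
    if denom == 0:
        return 0.0
    return ((m_pos - m_all) ** 2 + (m_neg - m_all) ** 2) / denom


def acc_mcc(tp, tn, fp, fn):
    """Overall accuracy and Matthews correlation coefficient."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc
```

In a wrapper-style selection of the kind the abstract describes, features would be ranked by `f_score` and added to the classifier incrementally, keeping the subset that maximizes cross-validated performance; the exact two-step procedure is not specified in the abstract.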
