
iORI-ENST: identifying origin of replication sites based on elastic net and stacking learning.

Affiliations

School of Mathematics and Statistics, Xidian University, Xi'an, P. R. China.

School of Science, Xi'an Polytechnic University, Xi'an, P. R. China.

Publication information

SAR QSAR Environ Res. 2021 Apr;32(4):317-331. doi: 10.1080/1062936X.2021.1895884. Epub 2021 Mar 18.

DOI: 10.1080/1062936X.2021.1895884
PMID: 33730950
Abstract

DNA replication is not only the basis of biological inheritance but also one of the most fundamental processes in all living organisms. It plays a crucial role in the cell-division cycle and in the regulation of gene expression. Accurate identification of origin of replication sites (ORIs) is therefore important for understanding the regulatory mechanisms of gene expression and for treating genetic diseases. In this paper, a novel model named iORI-ENST is designed for identifying ORIs. First, features are extracted by combining mono-nucleotide binary encoding with dinucleotide-based spatial autocorrelation. Next, elastic net is used as the feature selection method to choose the optimal feature set. Stacking learning is then employed to classify ORIs and non-ORIs; the stacked ensemble contains random forest, AdaBoost, gradient boosting decision tree, extra trees and support vector machine. On the benchmark datasets, the model identifies ORIs with accuracies of 91.41% and 95.07%, respectively. An independent dataset is also used to verify the validity and transferability of the model, on which its accuracy reaches 91.10%. Compared with state-of-the-art methods, the model achieves better performance. The results show that the model is a feasible, effective and powerful tool for identifying ORIs. The source code and datasets are available at https://github.com/YingyingYao/iORI-ENST.
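To make the first feature-extraction step concrete, the following is a minimal sketch of mono-nucleotide binary encoding, in which each base becomes a 4-bit indicator vector and the vectors are concatenated. The abstract does not state the exact mapping used by iORI-ENST, so the A/C/G/T assignment below is an assumed, commonly used convention, and the names `BINARY_CODE` and `encode_mononucleotide_binary` are hypothetical; the authors' actual implementation is in the repository linked above.

```python
from typing import Dict, List

# Assumed one-hot mapping (a common convention; not confirmed by the abstract).
BINARY_CODE: Dict[str, List[int]] = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def encode_mononucleotide_binary(sequence: str) -> List[int]:
    """Concatenate one 4-bit indicator vector per nucleotide.

    A sequence of length n yields a feature vector of length 4 * n.
    """
    features: List[int] = []
    for base in sequence.upper():
        if base not in BINARY_CODE:
            # Ambiguous bases such as N are rejected rather than guessed.
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(BINARY_CODE[base])
    return features


if __name__ == "__main__":
    print(encode_mononucleotide_binary("ACGT"))
```

Because every sequence position contributes four features, fixed-length input sequences give fixed-length vectors, which is what a downstream feature selector such as elastic net would expect.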


Similar articles

1
iORI-ENST: identifying origin of replication sites based on elastic net and stacking learning.
SAR QSAR Environ Res. 2021 Apr;32(4):317-331. doi: 10.1080/1062936X.2021.1895884. Epub 2021 Mar 18.
2
Integrating LASSO Feature Selection and Soft Voting Classifier to Identify Origins of Replication Sites.
Curr Genomics. 2022 Jun 10;23(2):83-93. doi: 10.2174/1389202923666220214122506.
3
iR5hmcSC: Identifying RNA 5-hydroxymethylcytosine with multiple features based on stacking learning.
Comput Biol Chem. 2021 Dec;95:107583. doi: 10.1016/j.compbiolchem.2021.107583. Epub 2021 Sep 20.
4
A computational platform to identify origins of replication sites in eukaryotes.
Brief Bioinform. 2021 Mar 22;22(2):1940-1950. doi: 10.1093/bib/bbaa017.
5
Computational prediction and interpretation of cell-specific replication origin sites from multiple eukaryotes by exploiting stacking framework.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa275.
6
Identification of DNA N4-methylcytosine sites based on multi-source features and gradient boosting decision tree.
Anal Biochem. 2022 Sep 1;652:114746. doi: 10.1016/j.ab.2022.114746. Epub 2022 May 21.
7
ORI-Deep: improving the accuracy for predicting origin of replication sites by using a blend of features and long short-term memory network.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac001.
8
iDHS-DASTS: identifying DNase I hypersensitive sites based on LASSO and stacking learning.
Mol Omics. 2021 Feb 22;17(1):130-141. doi: 10.1039/d0mo00115e.
9
ORI-Explorer: a unified cell-specific tool for origin of replication sites prediction by feature fusion.
Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad664.
10
Identify origin of replication in Saccharomyces cerevisiae using two-step feature selection technique.
Bioinformatics. 2019 Jun 1;35(12):2075-2083. doi: 10.1093/bioinformatics/bty943.

Cited by

1
Optimization of sports effect evaluation technology from random forest algorithm and elastic network algorithm.
PLoS One. 2023 Oct 20;18(10):e0292557. doi: 10.1371/journal.pone.0292557. eCollection 2023.
2
Integrating LASSO Feature Selection and Soft Voting Classifier to Identify Origins of Replication Sites.
Curr Genomics. 2022 Jun 10;23(2):83-93. doi: 10.2174/1389202923666220214122506.
3
Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning.
Int J Mol Sci. 2022 Dec 7;23(24):15490. doi: 10.3390/ijms232415490.
4
Accurate Identification of DNA Replication Origin by Fusing Epigenomics and Chromatin Interaction Information.
Research (Wash D C). 2022 Oct 29;2022:9780293. doi: 10.34133/2022/9780293. eCollection 2022.