iRO-PsekGCC: Identify DNA Replication Origins Based on Pseudo k-Tuple GC Composition.

Authors

Liu Bin, Chen Shengyu, Yan Ke, Weng Fan

Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China.

Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China.

Publication information

Front Genet. 2019 Sep 18;10:842. doi: 10.3389/fgene.2019.00842. eCollection 2019.

DOI:10.3389/fgene.2019.00842

PMID:31620165

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6759546/

Abstract

Identifying replication origins plays a key role in understanding the mechanism of DNA replication, and the task is of great significance in DNA sequence analysis. Because of its importance, several computational approaches have been introduced. Among these predictors, iRO-3wPseKNC is the first discriminative method able to correctly identify entire replication origins. To further improve predictive performance, in this study we propose the Pseudo k-tuple GC Composition (PsekGCC) approach, which captures the "GC asymmetry bias" of yeast species by considering both the GC skew and the sequence-order effects of k-tuple GC Composition (k-GCC). Based on PsekGCC, we propose a new predictor, iRO-PsekGCC, to identify DNA replication origins. A rigorous jackknife test on benchmark datasets for two yeast species indicated that iRO-PsekGCC outperformed iRO-3wPseKNC. We anticipate that iRO-PsekGCC will be a useful tool for DNA replication origin identification. A web server for the iRO-PsekGCC predictor has been established and can be accessed at http://bliulab.net/iRO-PsekGCC/.
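
The GC skew mentioned in the abstract is commonly defined as (G - C) / (G + C) over a stretch of sequence. The Python sketch below illustrates only that standard quantity, computed per sequence and per sliding window. It is not the PsekGCC feature itself, whose exact formula is not given in this abstract. The function names `gc_skew` and `windowed_gc_skew` and the parameters `window` and `step` are hypothetical and chosen for illustration.

```python
from __future__ import annotations


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one sequence, or 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """Return the GC skew of each full-length sliding window across seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Only windows that fit entirely inside the sequence are scored.
    return [
        gc_skew(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

A sign change in the windowed profile is the kind of asymmetry signal that skew-based analyses look for. How such values are combined with sequence-order information in iRO-PsekGCC is described in the full paper.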


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d10/6759546/6ba004ba5238/fgene-10-00842-g001.jpg

Similar articles

iRO-3wPseKNC: identify DNA replication origins by three-window-based PseKNC.

Bioinformatics. 2018 Sep 15;34(18):3086-3093. doi: 10.1093/bioinformatics/bty312.

iOri-Human: identify human origin of replication by incorporating dinucleotide physicochemical properties into pseudo nucleotide composition.

Oncotarget. 2016 Oct 25;7(43):69783-69793. doi: 10.18632/oncotarget.11975.

iNuc-PseKNC: a sequence-based predictor for predicting nucleosome positioning in genomes with pseudo k-tuple nucleotide composition.

Bioinformatics. 2014 Jun 1;30(11):1522-9. doi: 10.1093/bioinformatics/btu083. Epub 2014 Feb 6.

Asymmetry indices for analysis and prediction of replication origins in eukaryotic genomes.

PLoS One. 2012;7(9):e45050. doi: 10.1371/journal.pone.0045050. Epub 2012 Sep 27.

PseKNC: a flexible web server for generating pseudo K-tuple nucleotide composition.

Anal Biochem. 2014 Jul 1;456:53-60. doi: 10.1016/j.ab.2014.04.001. Epub 2014 Apr 13.

A computational platform to identify origins of replication sites in eukaryotes.

Brief Bioinform. 2021 Mar 22;22(2):1940-1950. doi: 10.1093/bib/bbaa017.

iEnhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition.

Bioinformatics. 2016 Feb 1;32(3):362-9. doi: 10.1093/bioinformatics/btv604. Epub 2015 Oct 17.

Recent advances in the genome-wide study of DNA replication origins in yeast.

Front Microbiol. 2015 Feb 19;6:117. doi: 10.3389/fmicb.2015.00117. eCollection 2015.

Sequence analysis of origins of replication in the Saccharomyces cerevisiae genomes.

Front Microbiol. 2014 Nov 18;5:574. doi: 10.3389/fmicb.2014.00574. eCollection 2014.

Cited by

A GHKNN model based on the physicochemical property extraction method to identify SNARE proteins.

Front Genet. 2022 Nov 23;13:935717. doi: 10.3389/fgene.2022.935717. eCollection 2022.

A deep learning framework combined with word embedding to identify DNA replication origins.

Sci Rep. 2021 Jan 12;11(1):844. doi: 10.1038/s41598-020-80670-x.

Computational prediction of species-specific yeast DNA replication origin via iterative feature representation.

Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa304.

Identifying Antioxidant Proteins by Using Amino Acid Composition and Protein-Protein Interactions.

Front Cell Dev Biol. 2020 Oct 29;8:591487. doi: 10.3389/fcell.2020.591487. eCollection 2020.

Identification and Classification of Enhancers Using Dimension Reduction Technique and Recurrent Neural Network.

Comput Math Methods Med. 2020 Oct 18;2020:8852258. doi: 10.1155/2020/8852258. eCollection 2020.

Prediction of Anticancer Peptides Using a Low-Dimensional Feature Model.

Front Bioeng Biotechnol. 2020 Aug 12;8:892. doi: 10.3389/fbioe.2020.00892. eCollection 2020.

Prediction of G Protein-Coupled Receptors With CTDC Extraction and MRMD2.0 Dimension-Reduction Methods.

Front Bioeng Biotechnol. 2020 Jun 25;8:635. doi: 10.3389/fbioe.2020.00635. eCollection 2020.

A Bioinformatics Tool for the Prediction of DNA N6-Methyladenine Modifications Based on Feature Fusion and Optimization Protocol.

Front Bioeng Biotechnol. 2020 Jun 4;8:502. doi: 10.3389/fbioe.2020.00502. eCollection 2020.

Its2vec: Fungal Species Identification Using Sequence Embedding and Random Forest Classification.

Biomed Res Int. 2020 May 27;2020:2468789. doi: 10.1155/2020/2468789. eCollection 2020.

References

DeepSVM-fold: protein fold recognition by combining support vector machines and pairwise sequence similarity scores generated by deep learning networks.

Brief Bioinform. 2020 Sep 25;21(5):1733-1741. doi: 10.1093/bib/bbz098.

BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches.

Nucleic Acids Res. 2019 Nov 18;47(20):e127. doi: 10.1093/nar/gkz740.

Deep-Resp-Forest: A deep forest model to predict anti-cancer drug response.

Methods. 2019 Aug 15;166:91-102. doi: 10.1016/j.ymeth.2019.02.009. Epub 2019 Feb 14.

i6mA-Pred: identifying DNA N6-methyladenine sites in the rice genome.

Bioinformatics. 2019 Aug 15;35(16):2796-2800. doi: 10.1093/bioinformatics/btz015.

iEnhancer-EL: identifying enhancers and their strength with ensemble learning approach.

Bioinformatics. 2018 Nov 15;34(22):3835-3842. doi: 10.1093/bioinformatics/bty458.

iRO-3wPseKNC: identify DNA replication origins by three-window-based PseKNC.

Bioinformatics. 2018 Sep 15;34(18):3086-3093. doi: 10.1093/bioinformatics/bty312.

BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches.

Brief Bioinform. 2019 Jul 19;20(4):1280-1294. doi: 10.1093/bib/bbx165.

ProtDec-LTR2.0: an improved method for protein remote homology detection by combining pseudo protein and supervised Learning to Rank.

Bioinformatics. 2017 Nov 1;33(21):3473-3476. doi: 10.1093/bioinformatics/btx429.

Oral microbial community assembly under the influence of periodontitis.

PLoS One. 2017 Aug 16;12(8):e0182259. doi: 10.1371/journal.pone.0182259. eCollection 2017.

A comprehensive review and comparison of different computational methods for protein remote homology detection.

Brief Bioinform. 2018 Mar 1;19(2):231-244. doi: 10.1093/bib/bbw108.
