iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples.

Authors

Kabir Muhammad, Hayat Maqsood

Affiliation

Department of Computer Science, Abdul Wali Khan University, Mardan, KP, Pakistan.

Publication

Mol Genet Genomics. 2016 Feb;291(1):285-96. doi: 10.1007/s00438-015-1108-5. Epub 2015 Aug 30.

DOI: 10.1007/s00438-015-1108-5
PMID: 26319782

Abstract

Meiotic recombination is vital for maintaining sequence diversity in the human genome. Meiosis and recombination are considered essential phases of cell division. In meiosis, the genome is divided into equal parts for sexual reproduction, whereas in recombination, diverse genomes are combined to form new combinations of genetic variations. Recombination does not occur randomly across the genome; its frequency varies by region, giving recombination "hotspots" and "coldspots". Owing to the huge growth of polygenetic sequence data in data banks, it is infeasible to recognize these sequences through conventional methods. Given the significance of recombination spots, an accurate, fast, robust, and high-throughput automated computational model is needed. In the proposed model, numerical descriptors are extracted using two sequence representation schemes: dinucleotide composition and trinucleotide composition. The performance of seven classification algorithms was investigated. Finally, the predicted outcomes of the individual classifiers are fused into an ensemble classifier, formed through majority voting and through a genetic algorithm (GA). The performance of the GA-based ensemble model is quite promising compared with the individual classifiers and the majority voting-based ensemble model. iRSpot-GAEnsC achieved 84.46% accuracy. The empirical results show that iRSpot-GAEnsC outperforms not only the examined algorithms but also the existing methods in the literature. It is anticipated that the proposed model may be helpful to the research community, academia, and drug discovery.
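
The two sequence representations and the voting step named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`kmer_composition`, `features`, `majority_vote`, `weighted_vote`) are hypothetical, the features are plain normalized k-mer frequencies, and the weighted vote is only one plausible form of a GA-based ensemble, assuming the GA searches for per-classifier weights (the abstract does not specify how the GA combines the classifiers).

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_composition(seq, k):
    """Return normalized frequencies of all 4**k k-mers in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing symbols outside ACGT are ignored in the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def features(seq):
    """Concatenate dinucleotide (16-D) and trinucleotide (64-D) composition."""
    return kmer_composition(seq, 2) + kmer_composition(seq, 3)


def majority_vote(labels):
    """Fuse 0/1 predictions (1 = hotspot), one per classifier, by simple majority."""
    return 1 if sum(labels) * 2 > len(labels) else 0


def weighted_vote(labels, weights):
    """Fuse 0/1 predictions using per-classifier weights.

    Assumption: a GA-based ensemble could search for these weights;
    here they are simply passed in by the caller.
    """
    score = sum(w for label, w in zip(labels, weights) if label == 1)
    return 1 if score * 2 > sum(weights) else 0


vec = features("ACGTACGT")
print(len(vec))  # 80 = 16 dinucleotide + 64 trinucleotide frequencies
```

With seven classifiers, as in the abstract, a simple majority vote cannot tie. In the weighted variant, a single highly weighted classifier can outvote several weaker ones, which is the extra flexibility a weight search would exploit.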
