iRSpot-SF: Prediction of recombination hotspots by incorporating sequence based features into Chou's Pseudo components.

Affiliations

Department of Computer Science and Engineering, United International University, Madani Avenue, Satarkul, Badda, Dhaka 1212, Bangladesh.

Publication information

Genomics. 2019 Jul;111(4):966-972. doi: 10.1016/j.ygeno.2018.06.003. Epub 2018 Jun 20.

DOI:10.1016/j.ygeno.2018.06.003
PMID:29935224
Abstract

Recombination hotspots in a genome are unevenly distributed. Hotspots are regions in a genome that show higher rates of meiotic recombination. Computational methods for recombination hotspot prediction often use sophisticated features derived from physico-chemical or structure-based properties of nucleotides. In this paper, we propose iRSpot-SF, which uses sequence-based features that are computationally cheap to generate. Four feature groups are used in our method: k-mer composition, gapped k-mer composition, TF-IDF of k-mers and reverse complement k-mer composition. We have used recursive feature elimination to select the top 17 features for hotspot prediction. Our analysis shows the superiority of the gapped k-mer composition and reverse complement k-mer composition features over the others. We have used an SVM with an RBF kernel as the classification algorithm. We have tested our algorithm on standard benchmark datasets. Compared to other methods, iRSpot-SF produces significantly better results in terms of accuracy, Matthews correlation coefficient and sensitivity: 84.58%, 0.6941 and 84.57%, respectively. We have made our method readily available as a Python-based tool and made the datasets and source code available at: https://github.com/abdlmaruf/iRSpot-SF. A web application based on iRSpot-SF has been developed and is freely available at: http://irspot.pythonanywhere.com/server.html.
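
To make three of the feature groups named in the abstract concrete, here is a minimal pure-Python sketch of k-mer composition, gapped k-mer composition and reverse complement k-mer composition. It is an illustration under assumptions, not the authors' implementation: the function names, the choice of k = 2, the definition of a gapped k-mer as a nucleotide pair separated by `gap` positions, and the merging of each k-mer with its reverse complement under the lexicographically smaller word are all hypothetical. The TF-IDF features, the recursive feature elimination step and the SVM classifier are not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmer_composition(seq, k):
    """Frequency of every length-k word over ACGT, normalized by window count."""
    seq = seq.upper()
    windows = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(max(windows, 0)))
    total = max(windows, 1)  # avoid division by zero on very short input
    words = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {w: counts[w] / total for w in words}


def gapped_pair_composition(seq, gap):
    """Frequency of nucleotide pairs separated by `gap` skipped positions.

    Assumption: this is one simple reading of 'gapped k-mer'; the paper may
    define the feature differently.
    """
    seq = seq.upper()
    windows = len(seq) - gap - 1
    counts = Counter(seq[i] + seq[i + gap + 1] for i in range(max(windows, 0)))
    total = max(windows, 1)
    return {a + b: counts[a + b] / total for a in ALPHABET for b in ALPHABET}


def reverse_complement(kmer):
    """Reverse complement of a DNA word, e.g. AC -> GT."""
    return kmer.translate(COMPLEMENT)[::-1]


def rc_kmer_composition(seq, k):
    """k-mer composition with each word merged with its reverse complement."""
    merged = {}
    for kmer, freq in kmer_composition(seq, k).items():
        canonical = min(kmer, reverse_complement(kmer))
        merged[canonical] = merged.get(canonical, 0.0) + freq
    return merged


def feature_vector(seq, k=2, gap=1):
    """Concatenate the three feature groups into one name -> value mapping."""
    feats = {}
    feats.update({f"kmer_{w}": v for w, v in kmer_composition(seq, k).items()})
    feats.update({f"gap{gap}_{w}": v
                  for w, v in gapped_pair_composition(seq, gap).items()})
    feats.update({f"rc_{w}": v for w, v in rc_kmer_composition(seq, k).items()})
    return feats


if __name__ == "__main__":
    example = "ACGTAC"  # toy sequence, not taken from the benchmark datasets
    for name, value in feature_vector(example).items():
        if value > 0:
            print(name, round(value, 3))
```

With k = 2 and gap = 1 this yields 16 + 16 + 10 = 42 candidate features per sequence; a selection step such as the recursive feature elimination described in the abstract would then reduce a larger pool of this kind to the final 17.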

Similar articles

1. iRSpot-SF: Prediction of recombination hotspots by incorporating sequence based features into Chou's Pseudo components.
   Genomics. 2019 Jul;111(4):966-972. doi: 10.1016/j.ygeno.2018.06.003. Epub 2018 Jun 20.
2. iRSpot-PDI: Identification of recombination spots by incorporating dinucleotide property diversity information into Chou's pseudo components.
   Genomics. 2019 May;111(3):457-464. doi: 10.1016/j.ygeno.2018.03.003. Epub 2018 Mar 13.
3. iRecSpot-EF: Effective sequence based features for recombination hotspot prediction.
   Comput Biol Med. 2018 Dec 1;103:17-23. doi: 10.1016/j.compbiomed.2018.10.005. Epub 2018 Oct 11.
4. iRSpot-TNCPseAAC: identify recombination spots with trinucleotide composition and pseudo amino acid components.
   Int J Mol Sci. 2014 Jan 24;15(2):1746-66. doi: 10.3390/ijms15021746.
5. iRSpot-Pse6NC: Identifying recombination spots in by incorporating hexamer composition into general PseKNC.
   Int J Biol Sci. 2018 May 22;14(8):883-891. doi: 10.7150/ijbs.24616. eCollection 2018.
6. iRSpot-ADPM: Identify recombination spots by incorporating the associated dinucleotide product model into Chou's pseudo components.
   J Theor Biol. 2018 Mar 14;441:1-8. doi: 10.1016/j.jtbi.2017.12.025. Epub 2018 Jan 2.
7. iRSpot-DTS: Predict recombination spots by incorporating the dinucleotide-based spare-cross covariance information into Chou's pseudo components.
   Genomics. 2019 Dec;111(6):1760-1770. doi: 10.1016/j.ygeno.2018.11.031. Epub 2018 Dec 6.
8. iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition.
   Nucleic Acids Res. 2013 Apr 1;41(6):e68. doi: 10.1093/nar/gks1450. Epub 2013 Jan 8.
9. LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.
   BMC Bioinformatics. 2014 Feb 17;15:48. doi: 10.1186/1471-2105-15-48.
10. A comparison and assessment of computational method for identifying recombination hotspots in Saccharomyces cerevisiae.
    Brief Bioinform. 2020 Sep 25;21(5):1568-1580. doi: 10.1093/bib/bbz123.

Cited by

1. K-mer-based Approaches to Bridging Pangenomics and Population Genetics.
   Mol Biol Evol. 2025 Mar 5;42(3). doi: 10.1093/molbev/msaf047.
2. Prediction of Recombination Spots Using Novel Hybrid Feature Extraction Method via Deep Learning Approach.
   Front Genet. 2020 Sep 17;11:539227. doi: 10.3389/fgene.2020.539227. eCollection 2020.
3. Some illuminating remarks on molecular genetics and genomics as well as drug development.
   Mol Genet Genomics. 2020 Mar;295(2):261-274. doi: 10.1007/s00438-019-01634-z. Epub 2020 Jan 1.