Splice sites detection using chaos game representation and neural network.

Affiliations

Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA.

Department of Mathematical Sciences, Tsinghua University, Beijing 100084, P.R. China.

Publication information

Genomics. 2020 Mar;112(2):1847-1852. doi: 10.1016/j.ygeno.2019.10.018. Epub 2019 Nov 5.

DOI:10.1016/j.ygeno.2019.10.018
PMID:31704313
Abstract

A novel method is proposed to detect the acceptor and donor splice sites using chaos game representation and artificial neural network. In order to achieve high accuracy, inputs to the neural network, or feature vector, shall reflect the true nature of the DNA segments. Therefore it is important to have one-to-one numerical representation, i.e. a feature vector should be able to represent the original data. Chaos game representation (CGR) is an iterative mapping technique that assigns each nucleotide in a DNA sequence to a respective position on the plane in a one-to-one manner. Using CGR, a DNA sequence can be mapped to a numerical sequence that reflects the true nature of the original sequence. In this research, we propose to use CGR as feature input to a neural network to detect splice sites on the NN269 dataset. Computational experiments indicate that this approach gives good accuracy while being simpler than other methods in the literature, with only one neural network component. The code and data for our method can be accessed from this link: https://github.com/thoang3/portfolio/tree/SpliceSites_ANN_CGR.
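The iterative mapping described in the abstract can be illustrated with a minimal sketch. It assumes the classic CGR construction: start at the center of the unit square and, for each nucleotide, move halfway toward that nucleotide's corner. The corner assignment, the function names (`cgr_points`, `cgr_features`, `cgr_decode`), and the flattened feature layout are assumptions made for illustration; they are not taken from the authors' repository and may differ from the encoding used in the paper.

```python
# Corner assignment is an assumption (one common convention); the paper may use another.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}
CORNER_TO_BASE = {corner: base for base, corner in CORNERS.items()}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR trajectory: one (x, y) point per nucleotide."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]  # raises KeyError on non-ACGT symbols
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_features(seq):
    """Flatten the trajectory into [x1, y1, x2, y2, ...] (hypothetical layout)."""
    return [coord for point in cgr_points(seq) for coord in point]


def cgr_decode(last_point, length):
    """Recover the sequence from its final CGR point and its length.

    The quadrant of a point identifies the last nucleotide; undoing the
    halving step gives the previous point. In double precision this is
    exact only for short sequences (roughly up to 50 bases).
    """
    x, y = last_point
    bases = []
    for _ in range(length):
        corner = (1.0 if x > 0.5 else 0.0, 1.0 if y > 0.5 else 0.0)
        bases.append(CORNER_TO_BASE[corner])
        x, y = 2.0 * x - corner[0], 2.0 * y - corner[1]
    return "".join(reversed(bases))


if __name__ == "__main__":
    seq = "GATTACA"
    pts = cgr_points(seq)
    print(pts[-1])                        # final CGR coordinate
    print(cgr_decode(pts[-1], len(seq)))  # round trip back to the sequence
```

The round trip through `cgr_decode` is what the one-to-one property means in practice: the final point, together with the sequence length, determines the whole sequence. A vector like the one returned by `cgr_features` could then serve as input to a neural network classifier, although the exact input layout used by the authors is not stated in the abstract.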

Similar articles

1
Splice sites detection using chaos game representation and neural network.
Genomics. 2020 Mar;112(2):1847-1852. doi: 10.1016/j.ygeno.2019.10.018. Epub 2019 Nov 5.
2
Numerical encoding of DNA sequences by chaos game representation with application in similarity comparison.
Genomics. 2016 Oct;108(3-4):134-142. doi: 10.1016/j.ygeno.2016.08.002. Epub 2016 Aug 15.
3
Detection of intra-family coronavirus genome sequences through graphical representation and artificial neural network.
Expert Syst Appl. 2022 May 15;194:116559. doi: 10.1016/j.eswa.2022.116559. Epub 2022 Jan 21.
4
SpliceFinder: ab initio prediction of splice sites using convolutional neural network.
BMC Bioinformatics. 2019 Dec 27;20(Suppl 23):652. doi: 10.1186/s12859-019-3306-3.
5
A computational approach for prediction of donor splice sites with improved accuracy.
J Theor Biol. 2016 Sep 7;404:285-294. doi: 10.1016/j.jtbi.2016.06.013. Epub 2016 Jun 11.
6
Encoding and Decoding DNA Sequences by Integer Chaos Game Representation.
J Comput Biol. 2019 Feb;26(2):143-151. doi: 10.1089/cmb.2018.0173. Epub 2018 Dec 5.
7
Analysis of genomic sequences by Chaos Game Representation.
Bioinformatics. 2001 May;17(5):429-37. doi: 10.1093/bioinformatics/17.5.429.
8
Multifarious aspects of the chaos game representation and its applications in biological sequence analysis.
Comput Biol Med. 2022 Dec;151(Pt A):106243. doi: 10.1016/j.compbiomed.2022.106243. Epub 2022 Oct 25.
9
A novel numerical representation for proteins: Three-dimensional Chaos Game Representation and its Extended Natural Vector.
Comput Struct Biotechnol J. 2020 Jul 15;18:1904-1913. doi: 10.1016/j.csbj.2020.07.004. eCollection 2020.
10
Applying MSSIM combined chaos game representation to genome sequences analysis.
Genomics. 2018 May;110(3):180-190. doi: 10.1016/j.ygeno.2017.09.010. Epub 2017 Sep 21.

Cited by

1
WalkIm: Compact image-based encoding for high-performance classification of biological sequences using simple tuning-free CNNs.
PLoS One. 2022 Apr 15;17(4):e0267106. doi: 10.1371/journal.pone.0267106. eCollection 2022.
2
Detection of intra-family coronavirus genome sequences through graphical representation and artificial neural network.
Expert Syst Appl. 2022 May 15;194:116559. doi: 10.1016/j.eswa.2022.116559. Epub 2022 Jan 21.
3
Chaos game representation and its applications in bioinformatics.
Comput Struct Biotechnol J. 2021 Nov 10;19:6263-6271. doi: 10.1016/j.csbj.2021.11.008. eCollection 2021.
4
Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms.
BMC Bioinformatics. 2021 Jun 2;22(Suppl 6):129. doi: 10.1186/s12859-021-04006-w.
5
Clustering and classification of virus sequence through music communication protocol and wavelet transform.
Genomics. 2021 Jan;113(1 Pt 2):778-784. doi: 10.1016/j.ygeno.2020.10.009. Epub 2020 Oct 16.