
A novel numerical representation for proteins: Three-dimensional Chaos Game Representation and its Extended Natural Vector.

Authors

Sun Zeju, Pei Shaojun, He Rong Lucy, Yau Stephen S-T

Affiliations

Department of Mathematical Sciences, Tsinghua University, Beijing, PR China.

Department of Biological Sciences, Chicago State University, Chicago, IL 60628, USA.

Publication information

Comput Struct Biotechnol J. 2020 Jul 15;18:1904-1913. doi: 10.1016/j.csbj.2020.07.004. eCollection 2020.

DOI:10.1016/j.csbj.2020.07.004
PMID:32774785
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7390779/
Abstract

Chaos Game Representation (CGR) was first proposed as an image representation method for DNA and has since been extended to other biological macromolecules. In the CGR image of DNA, a sequence is converted into a series of points in the unit square. By comparison, the existing CGR images of proteins are geometrically less elegant, and the meaning of the point distribution in the image is less obvious. In this study, by naturally distributing the twenty amino acids on the vertices of a regular dodecahedron, we introduce a novel three-dimensional image representation of protein sequences using the CGR method. We also associate each CGR image with a vector in high-dimensional Euclidean space, called the extended natural vector (ENV), in order to analyze the information contained in the CGR images. Based on the results of protein classification and phylogenetic analysis, our method could serve as a precise method for discovering biological relationships between proteins.
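
The construction described in the abstract can be illustrated with a minimal sketch. It places the twenty standard amino acids on the twenty vertices of a regular dodecahedron and applies the usual CGR iteration, in which each new point moves a fixed fraction of the way from the previous point towards the vertex of the current residue. Several details are assumptions made here for illustration and are not taken from the abstract: the alphabetical residue-to-vertex assignment (`VERTEX_OF`), the starting point at the centre, the contraction `ratio` of 0.5, and the function names. The extended natural vector is not reproduced, since the abstract does not define it.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def dodecahedron_vertices():
    """Return the 20 vertices of a regular dodecahedron centred at the origin."""
    inv = 1 / PHI
    # Eight cube vertices (+-1, +-1, +-1).
    verts = [tuple(map(float, v)) for v in itertools.product((-1, 1), repeat=3)]
    # Twelve more: cyclic permutations of (0, +-1/phi, +-phi).
    for a, b in itertools.product((-inv, inv), (-PHI, PHI)):
        verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    return verts


AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical

# Hypothetical assignment (residue i -> vertex i); the paper's own
# "natural" distribution of residues over the vertices may differ.
VERTEX_OF = dict(zip(AMINO_ACIDS, dodecahedron_vertices()))


def cgr_3d(sequence, ratio=0.5):
    """Map a protein sequence to a list of 3D CGR points.

    Starts at the origin (assumed) and moves `ratio` of the way towards
    the vertex of each residue in turn. Symbols outside the 20 standard
    residues (for example X or gap characters) are skipped.
    """
    point = (0.0, 0.0, 0.0)
    points = []
    for residue in sequence.upper():
        vertex = VERTEX_OF.get(residue)
        if vertex is None:
            continue
        point = tuple(p + ratio * (v - p) for p, v in zip(point, vertex))
        points.append(point)
    return points


if __name__ == "__main__":
    for xyz in cgr_3d("MKV"):
        print(tuple(round(c, 4) for c in xyz))
```

Because every point is a convex combination of the starting point and the vertices, all CGR points stay inside the dodecahedron, and each point depends on the entire prefix of the sequence, with later residues weighted most heavily.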

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/f95fdddc2c42/ga1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/d21954c1bfa0/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/37ad72322217/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/7f3ccb0fd4f0/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/0689b09ad5b1/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/fcc0d814d46b/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/5571eff2607d/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/6e515e752985/gr7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d46a/7390779/30b364f55ba4/gr8.jpg
