Identifying anticancer peptides by using a generalized chaos game representation.

Authors

Ge Li, Liu Jiaguo, Zhang Yusen, Dehmer Matthias

Affiliations

School of Mathematics and Statistics, Shandong University at Weihai, Weihai, 264209, China.

Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria.

Publication information

J Math Biol. 2019 Jan;78(1-2):441-463. doi: 10.1007/s00285-018-1279-x. Epub 2018 Oct 5.

DOI: 10.1007/s00285-018-1279-x
PMID: 30291366

Abstract

We generalize chaos game representation (CGR) to higher-dimensional spaces while preserving its bijectivity, so that the method remains sufficiently representative and mathematically rigorous compared with previous attempts. We first state and prove an asymptotic property of CGR and of our generalized chaos game representation (GCGR) method. It follows that the dissimilarity between sequences that contain an identical subsequence at different positions decreases exponentially with the length of that subsequence; this effect has been at work in earlier applications without being recognized by researchers. By making it explicit, we show that the effect fundamentally supports (G)CGR as a similarity measure or feature extraction technique. We develop two feature extraction techniques: GCGR-Centroid and GCGR-Variance. We use GCGR-Centroid to analyze the similarity between protein sequences on three datasets: 9 ND5, 24 TF, and 50 beta-globin proteins. The results are consistent with those of previous studies, which demonstrates the significance of the approach. Finally, using support vector machines, we train an anticancer peptide prediction model with both GCGR-Centroid and GCGR-Variance features, and achieve significantly higher prediction performance on three well-studied anticancer peptide datasets.
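
The exponential effect described in the abstract can be illustrated with the classic two-dimensional CGR for DNA. The sketch below is not the paper's GCGR and does not reproduce GCGR-Centroid or GCGR-Variance. The corner assignment, the starting point, and the helper names (cgr_points, centroid, final_point_gap) are assumptions chosen for illustration. It shows that two sequences ending in the same subsequence of length k have final CGR points within 2^-k of each other in every coordinate, and it computes a plain centroid of the points as a simple summary feature.

```python
# Classic 2-D chaos game representation (CGR) for DNA; NOT the paper's GCGR.
# The corner assignment is one common convention and is an assumption here.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the list of CGR points, one per symbol of seq."""
    x, y = start
    points = []
    for symbol in seq:
        cx, cy = CORNERS[symbol]
        # Move halfway from the current point toward the symbol's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def centroid(points):
    """Mean of the points (hypothetical helper, not the paper's GCGR-Centroid)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def final_point_gap(seq_a, seq_b):
    """Largest coordinate difference between the final CGR points."""
    (ax, ay), (bx, by) = cgr_points(seq_a)[-1], cgr_points(seq_b)[-1]
    return max(abs(ax - bx), abs(ay - by))


if __name__ == "__main__":
    suffix = "GATTACA"  # shared subsequence of length 7
    gap = final_point_gap("AAAA" + suffix, "TCTCTCGG" + suffix)
    # Each shared symbol halves the distance between the two trajectories,
    # so the gap is bounded by 2 ** -len(suffix).
    print(gap, "<=", 0.5 ** len(suffix))
```

Each shared symbol halves the distance between the two trajectories, which is why the bound shrinks exponentially with the length of the common suffix regardless of what precedes it.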


