A hybrid method to cluster protein sequences based on statistics and artificial neural networks.

Authors

Ferrán E A, Pflugfelder B

Affiliation

Sanofi Elf Bio Recherches, Labège Innopole, France.

Publication details

Comput Appl Biosci. 1993 Dec;9(6):671-80. doi: 10.1093/bioinformatics/9.6.671.

DOI: 10.1093/bioinformatics/9.6.671
PMID: 8143153

Abstract

We have recently proposed a method, based on artificial neural networks (ANNs), to cluster protein sequences into families according to their degree of sequence similarity. The network was trained with an unsupervised learning algorithm, using, as inputs, matrix patterns derived from the bipeptide composition of the protein sequences. We describe here some further improvements to that approach. First, we propose a statistical method to cluster a set of bipeptidic matrices into families. It consists of three stages: (i) principal component analysis, (ii) determination of the optimal number M of clusters and (iii) final classification of the bipeptidic matrices into M clusters. Using a set of 444 protein sequences, we show that the classification given by the statistical method is in agreement with biological knowledge. We also show that the resulting classification is very similar to the one previously obtained with the ANN approach. Finally, we propose a new hybrid method of the statistical and ANN approaches, in which the results of the statistical method are used to choose the number of neurons and inputs of the network. We show that a network built in this way, and fed with a few principal components of the set of bipeptidic matrices as input signals, can be trained in an extremely short computing time. The resulting topological maps do not essentially differ from the ones obtained with the initial ANN approach.
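
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the PCA procedure, the rule for choosing M, or the network architecture. The function names (`bipeptide_matrix`, `pca_scores`, `winner`, `train_som`), the power-iteration PCA, the one-dimensional Kohonen-style map, and all learning parameters are assumptions chosen for brevity. The number of neurons is simply passed in by hand, standing in for the value M that the statistical stage would supply.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def bipeptide_matrix(seq):
    """Flattened 20x20 matrix of adjacent-residue pair frequencies."""
    counts = [0.0] * 400
    for a, b in zip(seq, seq[1:]):
        if a in INDEX and b in INDEX:  # skip non-standard residues
            counts[INDEX[a] * 20 + INDEX[b]] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


def pca_scores(rows, k, iters=500):
    """First k principal-component scores (assumed method: power
    iteration with deflation on the Gram matrix of the centred data)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    gram = [[sum(x * y for x, y in zip(ci, cj)) for cj in centred]
            for ci in centred]
    scores = [[0.0] * k for _ in range(n)]
    for c in range(k):
        u = [float(i + 1) for i in range(n)]  # fixed asymmetric start
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
        for _ in range(iters):
            v = [sum(gram[i][j] * u[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in v))
            if norm < 1e-12:  # no variance left to extract
                break
            u = [x / norm for x in v]
        gu = [sum(gram[i][j] * u[j] for j in range(n)) for i in range(n)]
        lam = max(0.0, sum(a * b for a, b in zip(u, gu)))  # eigenvalue
        for i in range(n):
            scores[i][c] = u[i] * math.sqrt(lam)
            for j in range(n):
                gram[i][j] -= lam * u[i] * u[j]  # deflate this component
    return scores


def winner(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)))


def train_som(inputs, n_neurons, epochs=200, seed=0):
    """Train a 1-D Kohonen-style map (an assumed, simplified topology)."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for _ in range(n_neurons)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac + 0.01                   # decaying learning rate
        radius = max(n_neurons / 2.0 * frac, 0.5)  # shrinking neighbourhood
        for x in inputs:
            win = winner(weights, x)
            for j, w in enumerate(weights):
                h = math.exp(-((j - win) ** 2) / (2.0 * radius ** 2))
                for t in range(dim):
                    w[t] += rate * h * (x[t] - w[t])
    return weights


# Toy sequences (made up for illustration, not data from the paper).
seqs = ["ACACACACAC", "ACACACACAA", "WYWYWYWYWY", "WYWYWYWYWW"]
scores = pca_scores([bipeptide_matrix(s) for s in seqs], k=2)
som = train_som(scores, n_neurons=2)  # neuron count chosen by hand here
labels = [winner(som, s) for s in scores]
```

Feeding the map a few principal-component scores instead of all 400 bipeptide frequencies shrinks each weight vector from 400 values to `k`, which is one plausible reading of why the hybrid network trains quickly.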

Similar articles

1. A hybrid method to cluster protein sequences based on statistics and artificial neural networks.
   Comput Appl Biosci. 1993 Dec;9(6):671-80. doi: 10.1093/bioinformatics/9.6.671.
2. Protein classification using neural networks.
   Proc Int Conf Intell Syst Mol Biol. 1993;1:127-35.
3. Clustering proteins into families using artificial neural networks.
   Comput Appl Biosci. 1992 Feb;8(1):39-44. doi: 10.1093/bioinformatics/8.1.39.
4. Self-organized neural maps of human protein sequences.
   Protein Sci. 1994 Mar;3(3):507-21. doi: 10.1002/pro.5560030316.
5. FuzzyART neural network for protein classification.
   J Bioinform Comput Biol. 2010 Oct;8(5):825-41. doi: 10.1142/s0219720010004951.
6. Topological maps of protein sequences.
   Biol Cybern. 1991;65(6):451-8. doi: 10.1007/BF00204658.
7. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research.
   J Pharm Biomed Anal. 2000 Jun;22(5):717-27. doi: 10.1016/s0731-7085(99)00272-1.
8. Channel selection and classification of electroencephalogram signals: an artificial neural network and genetic algorithm-based approach.
   Artif Intell Med. 2012 Jun;55(2):117-26. doi: 10.1016/j.artmed.2012.02.001. Epub 2012 Apr 12.
9. On the quality of tree-based protein classification.
   Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
10. Optimal classification of protein sequences and selection of representative sets from multiple alignments: application to homologous families and lessons for structural genomics.
    Protein Eng. 2001 Apr;14(4):209-17. doi: 10.1093/protein/14.4.209.

Cited by

1. Self-organizing tree-growing network for the classification of protein sequences.
   Protein Sci. 1998 Dec;7(12):2613-22. doi: 10.1002/pro.5560071215.
2. Self-organized neural maps of human protein sequences.
   Protein Sci. 1994 Mar;3(3):507-21. doi: 10.1002/pro.5560030316.