Clustering proteins into families using artificial neural networks.

Authors

Ferrán E A, Ferrara P

Affiliation

Sanofi Elf Bio Recherches, Labège Innopole, France.

Publication

Comput Appl Biosci. 1992 Feb;8(1):39-44. doi: 10.1093/bioinformatics/8.1.39.

DOI: 10.1093/bioinformatics/8.1.39
PMID: 1314686

Abstract

An artificial neural network was used to cluster proteins into families. The network, composed of 7 x 7 neurons, was trained with the Kohonen unsupervised learning algorithm using, as inputs, matrix patterns derived from the bipeptide composition of 447 proteins, belonging to 13 different families. As a result of the training, and without any a priori indication of the number or composition of the expected families, the network self-organized the activation of its neurons into topologically ordered maps in which almost all the proteins (96.7%) were correctly clustered into the corresponding families. In a second computational experiment, a similar network was trained with one family of the previous learning set (76 cytochrome c sequences). The new neural map clustered these proteins into 25 different neurons (five in the first experiment), wherein phylogenetically related sequences were positioned close to each other. This result shows that the network can adapt the clustering resolution to the complexity of the learning set, a useful feature when working with an unknown number of clusters. Although the learning stage is time consuming, once the topological map is obtained, the classification of new proteins is very fast. Altogether, our results suggest that this novel approach may be a useful tool to organize the search for homologies in large macromolecular databases.
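The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the bipeptide matrix is normalized, how the learning rate and neighborhood shrink, or how the weights are initialized, so the unit-length normalization, Gaussian neighborhood, linear decay schedule, and all function names below (`bipeptide_composition`, `train_som`, `best_matching_unit`) are assumptions chosen for illustration.

```python
import math
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def bipeptide_composition(sequence):
    """Flattened 20 x 20 matrix of adjacent residue-pair counts.

    Assumption: the vector is scaled to unit length; the abstract does not
    state which normalization was used.
    """
    counts = [0.0] * len(DIPEPTIDES)
    for a, b in zip(sequence, sequence[1:]):
        idx = INDEX.get(a + b)
        if idx is not None:  # skip pairs containing non-standard residues
            counts[idx] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts


def best_matching_unit(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for pos, w in weights.items():
        dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best


def train_som(data, rows=7, cols=7, epochs=20, lr0=0.5, radius0=3.0, seed=0):
    """Train a Kohonen map on the input vectors without using family labels.

    Assumptions: small random initial weights, Gaussian neighborhood, and
    linearly decaying learning rate and radius.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() * 0.01 for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        order = list(data)
        rng.shuffle(order)
        for x in order:
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    # Pull the neuron toward x, scaled by grid proximity to the winner.
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

Once the map is trained, placing a new protein requires only one call to `best_matching_unit` on its composition vector, which is consistent with the abstract's remark that classification is fast after the time-consuming learning stage.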

Similar articles

1
Clustering proteins into families using artificial neural networks.
Comput Appl Biosci. 1992 Feb;8(1):39-44. doi: 10.1093/bioinformatics/8.1.39.
2
Protein classification using neural networks.
Proc Int Conf Intell Syst Mol Biol. 1993;1:127-35.
3
A hybrid method to cluster protein sequences based on statistics and artificial neural networks.
Comput Appl Biosci. 1993 Dec;9(6):671-80. doi: 10.1093/bioinformatics/9.6.671.
4
Self-organized neural maps of human protein sequences.
Protein Sci. 1994 Mar;3(3):507-21. doi: 10.1002/pro.5560030316.
5
Topological maps of protein sequences.
Biol Cybern. 1991;65(6):451-8. doi: 10.1007/BF00204658.
6
FuzzyART neural network for protein classification.
J Bioinform Comput Biol. 2010 Oct;8(5):825-41. doi: 10.1142/s0219720010004951.
7
Motif identification neural design for rapid and sensitive protein family search.
Comput Appl Biosci. 1996 Apr;12(2):109-18. doi: 10.1093/bioinformatics/12.2.109.
8
Kohonen map as a visualization tool for the analysis of protein sequences: multiple alignments, domains and segments of secondary structures.
Comput Appl Biosci. 1996 Dec;12(6):447-54. doi: 10.1093/bioinformatics/12.6.447.
9
On the quality of tree-based protein classification.
Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
10
Neural networks for molecular sequence classification.
Proc Int Conf Intell Syst Mol Biol. 1993;1:429-37.

Cited by

1
SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model.
BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S16. doi: 10.1186/1471-2105-12-S1-S16.
2
Self-organizing tree-growing network for the classification of protein sequences.
Protein Sci. 1998 Dec;7(12):2613-22. doi: 10.1002/pro.5560071215.
3
Self-organized neural maps of human protein sequences.
Protein Sci. 1994 Mar;3(3):507-21. doi: 10.1002/pro.5560030316.
4
Topological maps of protein sequences.
Biol Cybern. 1991;65(6):451-8. doi: 10.1007/BF00204658.