Self-organizing tree-growing network for the classification of protein sequences.

Authors

Wang H C, Dopazo J, de la Fraga L G, Zhu Y P, Carazo J M

Affiliation

Centro Nacional de Biotecnologia-CSIC, Universidad Autonoma, Madrid, Spain.

Publication details

Protein Sci. 1998 Dec;7(12):2613-22. doi: 10.1002/pro.5560071215.

DOI: 10.1002/pro.5560071215
PMID: 9865956
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2143887/

Abstract

The self-organizing tree algorithm (SOTA) was recently introduced to construct phylogenetic trees from biological sequences, based on the principles of Kohonen's self-organizing maps and Fritzke's growing cell structures. SOTA is designed so that the generation of new nodes can be stopped once the sequences assigned to a node exceed a given similarity threshold. In this way, a phylogenetic tree resolved at a high taxonomic level can be obtained, a capability that is especially useful for classifying sets of diversified sequences. SOTA was originally designed to analyze pre-aligned sequences. It has now been adapted to analyze patterns associated with the frequency of residues along a sequence, such as protein dipeptide composition and other n-gram compositions. In this work we show that the algorithm, applied to these data, not only successfully constructs phylogenetic trees of protein families, such as cytochrome c, triosephosphate isomerase, and hemoglobin alpha chains, but also classifies highly diversified sequence data sets, such as a mixture of interleukins and their receptors.
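
To make the input representation concrete, the sketch below computes an n-gram composition vector (dipeptides when n = 2) for an unaligned protein sequence, and pairs it with a simple check of whether a group of sequences is already similar enough to stop splitting. This is an illustration under stated assumptions, not the authors' implementation: the function names, the cosine similarity measure, the all-pairs criterion, and the default threshold of 0.9 are all hypothetical choices, and the abstract does not specify how SOTA measures similarity within a node.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ngram_composition(sequence, n=2):
    """Return relative frequencies of all 20**n n-grams in a fixed order.

    n-grams containing non-standard symbols (e.g. 'X') are ignored.
    """
    sequence = sequence.upper()
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    valid = set(keys)
    # Count every overlapping window of length n.
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    total = sum(c for k, c in counts.items() if k in valid)
    if total == 0:
        return [0.0] * len(keys)
    return [counts[k] / total for k in keys]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def node_is_homogeneous(vectors, threshold=0.9):
    """Hypothetical stopping check (an assumption, not the paper's criterion).

    Returns True when every pair of composition vectors assigned to a node
    has cosine similarity of at least `threshold`.
    """
    return all(
        cosine_similarity(vectors[i], vectors[j]) >= threshold
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
    )
```

Because every sequence maps to a vector of the same length (400 entries for dipeptides), sequences of different lengths can be compared directly without a prior alignment, which is the property the adapted algorithm relies on. A caller might, for example, build `vectors = [ngram_composition(s) for s in sequences]` for the sequences assigned to one node and stop growing below that node when `node_is_homogeneous(vectors)` is true.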

Similar articles

1
Self-organizing tree-growing network for the classification of protein sequences.
Protein Sci. 1998 Dec;7(12):2613-22. doi: 10.1002/pro.5560071215.
2
On the quality of tree-based protein classification.
Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
3
New approaches to phylogenetic tree search and their application to large numbers of protein alignments.
Syst Biol. 2007 Oct;56(5):727-40. doi: 10.1080/10635150701611134.
4
A novel method to analyze the similarity of biological sequences.
J Biomol Struct Dyn. 2009 Apr;26(5):599-608. doi: 10.1080/07391102.2009.10507275.
5
Bayesian coestimation of phylogeny and sequence alignment.
BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83.
6
SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.
Syst Biol. 2012 Jan;61(1):90-106. doi: 10.1093/sysbio/syr095. Epub 2011 Dec 1.
7
Predicting functional sites with an automated algorithm suitable for heterogeneous datasets.
BMC Bioinformatics. 2005 May 13;6:116. doi: 10.1186/1471-2105-6-116.
8
pHMM-tree: phylogeny of profile hidden Markov models.
Bioinformatics. 2017 Apr 1;33(7):1093-1095. doi: 10.1093/bioinformatics/btw779.
9
Phylogenetic reconstruction using an unsupervised growing neural network that adopts the topology of a phylogenetic tree.
J Mol Evol. 1997 Feb;44(2):226-33. doi: 10.1007/pl00006139.
10
UPSEC: an algorithm for classifying unaligned protein sequences into functional families.
J Comput Biol. 2008 May;15(4):431-43. doi: 10.1089/cmb.2007.0113.

Cited by

1
Transcriptome Analysis Suggests That Chromosome Introgression Fragments from Sea Island Cotton Increase Fiber Strength in Upland Cotton.
G3 (Bethesda). 2017 Oct 5;7(10):3469-3479. doi: 10.1534/g3.117.300108.
2
A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.
PLoS One. 2012;7(2):e30986. doi: 10.1371/journal.pone.0030986. Epub 2012 Feb 20.
