Back-propagation and counter-propagation neural networks for phylogenetic classification of ribosomal RNA sequences.

Authors

Wu C, Shivakumar S

Affiliation

Department of Epidemiology/Biomathematics, University of Texas Health Center at Tyler 75710.

Publication information

Nucleic Acids Res. 1994 Oct 11;22(20):4291-9. doi: 10.1093/nar/22.20.4291.

Abstract

A neural network system has been developed for rapid and accurate classification of ribosomal RNA sequences according to phylogenetic relationship. The molecular sequences are encoded into neural input vectors using an n-gram hashing method. An SVD (singular value decomposition) method is used to compress the long, sparse n-gram input vectors. The neural networks used are three-layered, feed-forward networks that employ supervised learning paradigms, including the back-propagation algorithm and a modified counter-propagation algorithm. A pedagogical pattern selection strategy is used to reduce the training time. After being trained with ribosomal RNA sequences from the RDP (Ribosomal Database Project) database, the system can classify query sequences into more than one hundred phylogenetic classes with 100% accuracy at a rate of less than 0.3 CPU seconds per sequence on a workstation. Compared with other sequence similarity search methods, including Similarity Rank, Blast and Fasta, the neural network method has a higher classification accuracy and is about an order of magnitude faster. The software tool will be made available to the biology community, and the system may be extended into a gene identification system for classifying indiscriminately sequenced DNA fragments.
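The encoding and compression steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alphabet `ACGU`, the n-gram length `n=3`, the count normalization, the target dimension `k=2`, the function names `ngram_vector` and `svd_compress`, and the three toy sequences are all assumptions chosen for illustration, and a plain dictionary lookup stands in for the paper's n-gram hashing.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGU"  # assumed RNA alphabet; the abstract does not specify one


def ngram_vector(seq, n=3):
    """Return normalized counts of overlapping n-grams, one slot per possible n-gram."""
    # The dictionary acts as a collision-free hash from n-gram to vector position.
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=n))}
    vec = np.zeros(len(index))
    for i in range(len(seq) - n + 1):
        slot = index.get(seq[i:i + n])
        if slot is not None:  # skip n-grams containing symbols outside ALPHABET
            vec[slot] += 1
    total = vec.sum()
    return vec / total if total else vec


def svd_compress(X, k):
    """Project the rows of X onto the top-k right singular vectors of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k].T  # shape: (n_features, k)
    return X @ basis, basis


# Hypothetical toy sequences, not taken from the RDP database.
sequences = ["ACGUACGUGGCA", "UUGACGUACGAA", "GGCAUUGACCGU"]
X = np.vstack([ngram_vector(s) for s in sequences])  # long, sparse vectors
Z, basis = svd_compress(X, k=2)                      # short, dense vectors
```

In this sketch the rows of `Z` would serve as the reduced inputs to a feed-forward classifier, and a query sequence would be encoded with `ngram_vector` and projected with the same `basis` before classification.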




