Motif discoveries in unaligned molecular sequences using self-organizing neural networks.

Authors

Liu Derong, Xiong Xiaoxu, DasGupta Bhaskar, Zhang Huaguang

Publication information

IEEE Trans Neural Netw. 2006 Jul;17(4):919-928. doi: 10.1109/TNN.2006.875987.

DOI: 10.1109/TNN.2006.875987
PMID: 16856655
Abstract

In this paper, we study the problem of motif discoveries in unaligned DNA and protein sequences. The problem of motif identification in DNA and protein sequences has been studied for many years in the literature. Major hurdles at this point include computational complexity and reliability of the search algorithms. We propose a self-organizing neural network structure for solving the problem of motif identification in DNA and protein sequences. Our network contains several layers, with each layer performing classifications at different levels. The top layer divides the input space into a small number of regions and the bottom layer classifies all input patterns into motifs and nonmotif patterns. Depending on the number of input patterns to be classified, several layers between the top layer and the bottom layer are needed to perform intermediate classifications. We maintain a low computational complexity through the use of the layered structure so that each pattern's classification is performed with respect to a small subspace of the whole input space. Our self-organizing neural network will grow as needed (e.g., when more motif patterns are classified). It will give the same amount of attention to each input pattern and will not omit any potential motif patterns. Finally, simulation results show that our algorithm outperforms existing algorithms in certain aspects. In particular, simulation results show that our algorithm can identify motifs with more mutations than existing algorithms. Our algorithm works well for long DNA sequences as well.
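
The layered idea in the abstract (a coarse top layer that partitions the input space, a bottom layer that separates motif from nonmotif patterns, and nodes added as needed) can be illustrated with a small sketch. This is not the authors' network: it swaps the self-organizing neural network for a two-level leader clustering over Hamming distance, and every name and parameter here (`find_motif_candidates`, `coarse_radius`, `fine_radius`, `min_seqs`) is a hypothetical choice made for illustration only.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seqs, w):
    """Yield (sequence index, position, pattern) for every length-w window."""
    for si, s in enumerate(seqs):
        for i in range(len(s) - w + 1):
            yield si, i, s[i:i + w]


def assign(nodes, pattern, radius):
    """Return the index of the nearest prototype within `radius`.

    If no prototype is close enough, the layer "grows": the pattern
    itself is appended as a new prototype and its index is returned.
    """
    best, best_d = None, radius + 1
    for idx, proto in enumerate(nodes):
        d = hamming(proto, pattern)
        if d < best_d:
            best, best_d = idx, d
    if best is None:
        nodes.append(pattern)
        best = len(nodes) - 1
    return best


def find_motif_candidates(seqs, w, coarse_radius, fine_radius, min_seqs):
    """Two-level grouping of all windows (assumed simplification).

    Top level: coarse regions, so each pattern is compared against only
    a small subset of the fine prototypes. Bottom level: tight groups;
    a group seen in at least `min_seqs` distinct sequences is reported
    as a motif candidate, everything else is treated as nonmotif.
    """
    top_nodes = []
    bottom_nodes = defaultdict(list)   # top index -> fine prototypes
    members = defaultdict(list)        # (top, bottom) -> window hits
    for si, pos, pat in windows(seqs, w):
        t = assign(top_nodes, pat, coarse_radius)
        b = assign(bottom_nodes[t], pat, fine_radius)
        members[(t, b)].append((si, pos, pat))
    return [hits for hits in members.values()
            if len({si for si, _, _ in hits}) >= min_seqs]


if __name__ == "__main__":
    # Toy input: "ACGT" planted in the first sequence, a one-mutation
    # variant "ACGA" planted in the second.
    demo = ["TTTTACGT", "CCCCACGA"]
    for group in find_motif_candidates(demo, 4, 2, 1, 2):
        print(group)
```

Every window is visited exactly once, which mirrors the abstract's claim that each input pattern receives the same attention. Note that leader clustering depends on input order and that overlapping windows around a planted motif can also form candidate groups, so this sketch says nothing about the accuracy or complexity results reported in the paper.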

Similar articles

1
Motif discoveries in unaligned molecular sequences using self-organizing neural networks.
IEEE Trans Neural Netw. 2006 Jul;17(4):919-928. doi: 10.1109/TNN.2006.875987.
2
Identification of motifs with insertions and deletions in protein sequences using self-organizing neural networks.
Neural Netw. 2005 Jun-Jul;18(5-6):835-42. doi: 10.1016/j.neunet.2005.06.007.
3
Voting algorithms for the motif finding problem.
Comput Syst Bioinformatics Conf. 2008;7:37-47.
4
Finding motifs from all sequences with and without binding sites.
Bioinformatics. 2006 Sep 15;22(18):2217-23. doi: 10.1093/bioinformatics/btl371. Epub 2006 Jul 26.
5
A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length.
Bioinformatics. 2005 May 15;21(10):2240-5. doi: 10.1093/bioinformatics/bti336. Epub 2005 Feb 22.
6
Discriminative motif discovery in DNA and protein sequences using the DEME algorithm.
BMC Bioinformatics. 2007 Oct 15;8:385. doi: 10.1186/1471-2105-8-385.
7
Combining phylogenetic data with co-regulated genes to identify regulatory motifs.
Bioinformatics. 2003 Dec 12;19(18):2369-80. doi: 10.1093/bioinformatics/btg329.
8
MUSA: a parameter free algorithm for the identification of biologically significant motifs.
Bioinformatics. 2006 Dec 15;22(24):2996-3002. doi: 10.1093/bioinformatics/btl537. Epub 2006 Oct 26.
9
Finding motifs in the twilight zone.
Bioinformatics. 2002 Oct;18(10):1374-81. doi: 10.1093/bioinformatics/18.10.1374.
10
Tsukuba BB: a branch and bound algorithm for local multiple alignment of DNA and protein sequences.
J Comput Biol. 2001;8(3):283-303. doi: 10.1089/10665270152530854.

Cited by

1
Image correlation method for DNA sequence alignment.
PLoS One. 2012;7(6):e39221. doi: 10.1371/journal.pone.0039221. Epub 2012 Jun 27.
2
SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model.
BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S16. doi: 10.1186/1471-2105-12-S1-S16.
3
A survey of DNA motif finding algorithms.
BMC Bioinformatics. 2007 Nov 1;8 Suppl 7(Suppl 7):S21. doi: 10.1186/1471-2105-8-S7-S21.