Rajapakse Jagath C, Ho Loi Sy
BioInformatics Research Center, School of Computer Engineering, Nanyang Technological University, Singapore 639798.
IEEE/ACM Trans Comput Biol Bioinform. 2005 Apr-Jun;2(2):131-42. doi: 10.1109/TCBB.2005.27.
We present a technique to encode the inputs to neural networks for the detection of signals in genomic sequences. The encoding is based on lower-order Markov models that incorporate known biological characteristics of genomic sequences. The neural networks then learn intrinsic higher-order dependencies of nucleotides at the signal sites. We demonstrate the efficacy of the Markov encoding method in the detection of three genomic signals, namely, splice sites, transcription start sites, and translation initiation sites.
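The abstract does not spell out the encoding itself, so the following is only a minimal sketch of one way a lower-order Markov model could turn a nucleotide window into numeric inputs for a neural network. It assumes a position-specific first-order model estimated from aligned, equal-length example sequences, and it assumes each position is encoded by the conditional probability of its nucleotide given the preceding one. The function names (`fit_first_order_markov`, `markov_encode`), the pseudocount, and the toy sequences are hypothetical and are not taken from the paper.

```python
ALPHABET = "ACGT"


def fit_first_order_markov(seqs, pseudocount=1.0):
    """Estimate a position-specific first-order Markov model.

    Assumption: `seqs` are aligned, equal-length strings over ACGT.
    Returns (init, trans), where init[a] = P(x_0 = a) and
    trans[i - 1][p][a] = P(x_i = a | x_{i-1} = p).
    """
    length = len(seqs[0])

    # Smoothed distribution of the first nucleotide.
    init_counts = {a: pseudocount for a in ALPHABET}
    for s in seqs:
        init_counts[s[0]] += 1
    total = sum(init_counts.values())
    init = {a: c / total for a, c in init_counts.items()}

    # Smoothed transition table for every later position.
    trans = []
    for i in range(1, length):
        counts = {p: {a: pseudocount for a in ALPHABET} for p in ALPHABET}
        for s in seqs:
            counts[s[i - 1]][s[i]] += 1
        probs = {}
        for p in ALPHABET:
            row_total = sum(counts[p].values())
            probs[p] = {a: c / row_total for a, c in counts[p].items()}
        trans.append(probs)
    return init, trans


def markov_encode(seq, model):
    """Encode a sequence as one probability per position (assumed scheme)."""
    init, trans = model
    features = [init[seq[0]]]
    for i in range(1, len(seq)):
        features.append(trans[i - 1][seq[i - 1]][seq[i]])
    return features


# Toy usage: fit on three made-up dinucleotides, then encode one of them.
model = fit_first_order_markov(["AG", "AG", "AT"])
features = markov_encode("AG", model)
```

Under these assumptions, the resulting feature vector would be fed to a neural network in place of a plain one-hot encoding, leaving the network to capture dependencies beyond those already modeled by the Markov chain.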