Hota Malaya Kumar, Srivastava Vinay Kumar
Department of Electronics and Communication Engineering, Motilal Nehru National Institute of Technology, Allahabad 211004, Uttar Pradesh, India.
Int J Data Min Bioinform. 2011;5(1):110-27. doi: 10.1504/ijdmb.2011.038580.
This paper analyses, at the nucleotide level, the performance of various sliding-window trigonometric fast transforms for identifying protein-coding regions. The Short-Time Discrete Fourier Transform (ST-DFT) gives better identification accuracy than the Short-Time Discrete Cosine Transform (ST-DCT), the Short-Time Discrete Sine Transform (ST-DST) and the Short-Time Discrete Hartley Transform (ST-DHT). The proposed method improves the identification accuracy of protein-coding regions by applying Singular Value Decomposition (SVD) to the DNA spectrum obtained with the sliding-window trigonometric fast transforms. With the proposed method, all of the trigonometric fast transforms give nearly the same area under the ROC curve on the GENSCAN test set.
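The pipeline summarised in the abstract (a sliding-window transform of the DNA sequence followed by SVD of the resulting spectrum) can be illustrated with a minimal Python sketch. Several details below are assumptions rather than the authors' stated procedure: the binary indicator mapping of nucleotides, the use of the period-3 bin (k = window/3) as the coding signature, the window length, and in particular the way SVD is applied (a rank-1 approximation of the 4 x windows spectrum matrix). The abstract does not specify any of these, and the function names are hypothetical.

```python
import numpy as np


def indicator_sequences(dna):
    """Binary indicator sequence per nucleotide (rows ordered A, C, G, T)."""
    dna = dna.upper()
    return np.array([[1.0 if c == base else 0.0 for c in dna] for base in "ACGT"])


def st_dft_period3(dna, window, step=1):
    """Sliding-window DFT power at bin k = window/3 for each indicator sequence.

    Returns an array of shape (4, n_windows).
    """
    if window % 3 != 0:
        raise ValueError("window must be a multiple of 3")
    u = indicator_sequences(dna)
    # DFT basis vector for bin k = window/3: exp(-2j*pi*k*n/window) = exp(-2j*pi*n/3)
    kernel = np.exp(-2j * np.pi * np.arange(window) / 3)
    starts = range(0, u.shape[1] - window + 1, step)
    return np.array([[abs(np.dot(u[b, s:s + window], kernel)) ** 2 for s in starts]
                     for b in range(4)])


def svd_rank1_score(spectrum):
    """Assumed SVD step: keep the dominant singular component of the
    (4, n_windows) spectrum and sum it over nucleotides, one score per window."""
    U, s, Vt = np.linalg.svd(spectrum, full_matrices=False)
    rank1 = s[0] * np.outer(U[:, 0], Vt[0])
    return np.abs(rank1).sum(axis=0)


# Synthetic example (not real genomic data): a perfectly period-3 stretch
# embedded in a random background.
rng = np.random.default_rng(0)


def background(n):
    return "".join(rng.choice(list("ACGT"), size=n))


dna = background(300) + "ATG" * 100 + background(300)
spectrum = st_dft_period3(dna, window=90)
scores = svd_rank1_score(spectrum)
print(spectrum.shape, int(np.argmax(scores)))
```

On this synthetic input the per-window score should be much higher for windows inside the repeated stretch than for windows in the random background. Evaluating real sequences would additionally require nucleotide-level coding annotations and a threshold sweep to obtain the ROC curve mentioned in the abstract.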