Zhang Jin, Zhang Wenqing, Yang Huijie
Business School, University of Shanghai for Science and Technology, Shanghai, 200093, China.
School of Information Science and Engineering, University of Jinan, Jinan, 250022, China.
J Biol Phys. 2016 Jan;42(1):99-106. doi: 10.1007/s10867-015-9399-7. Epub 2015 Aug 29.
Identifying coding regions in DNA sequences remains challenging. Various methods have been proposed, but they are limited by species dependence and by the need for adequate training sets. The elements in DNA coding regions are known to be distributed in a quasi-random way, whereas those in non-coding regions display typical, similar structures. For short sequences, these statistical characteristics cannot be extracted correctly, or even detected. This paper introduces a new approach to the problem: balanced estimation of diffusion entropy (BEDE).