Investigation of phase shifts for different period lengths in the genomes of C. elegans, D. melanogaster and S. cerevisiae.

Authors

Pugacheva Valentina, Frenkel Felix, Korotkov Eugene

Affiliations

Bioengineering Centre of Russian Academy of Science, Moscow 117312, Russia.

Publication information

Comput Biol Chem. 2014 Aug;51:12-21. doi: 10.1016/j.compbiolchem.2014.03.004. Epub 2014 Apr 13.

Abstract

We describe a new mathematical method for finding highly diverged short tandem repeats that contain a single indel. The method compares two frequency matrices: one built from the subsequence before the phase shift and one from the subsequence after it. The comparison measure is based on matrix similarity. We applied the approach to the genomes of Caenorhabditis elegans, Drosophila melanogaster and Saccharomyces cerevisiae, searching for tandem repeats with repeat lengths of 2 to 11 nucleotides, excluding lengths of 3, 6 and 9 nucleotides. The number of phase shift regions found in these genomes was approximately 2.2 × 10⁴, 1.5 × 10⁴ and 1.7 × 10², respectively. The type I error was less than 5%. The mean length of the fuzzy periodicity and phase shift regions was about 220 nucleotides. Regions of fuzzy periodicity with a single insertion or deletion occupy substantial parts of the genomes: 5%, 3% and 0.3%, respectively. Fewer than 10% of these regions had been detected previously. That is, the number of such regions in the genomes of C. elegans, D. melanogaster and S. cerevisiae is dramatically higher than has been revealed by any known method. We suppose that some of the regions of fuzzy periodicity found could be protein-binding regions.
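The idea of comparing a frequency matrix built before a phase shift with one built after it can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the significance test, so cosine similarity is used here purely as a stand-in, and the function names (`frequency_matrix`, `rotate_columns`, `best_phase_shift`) and the toy sequence are hypothetical.

```python
import math

ALPHABET = "ACGT"

def frequency_matrix(seq, period, start=0):
    """Count bases by period position (rows: A, C, G, T; columns: coordinate mod period).

    `start` is the coordinate of the first base of `seq` in the full sequence.
    """
    matrix = [[0] * period for _ in ALPHABET]
    for i, base in enumerate(seq):
        row = ALPHABET.find(base)
        if row >= 0:  # skip symbols outside A, C, G, T
            matrix[row][(start + i) % period] += 1
    return matrix

def rotate_columns(matrix, shift):
    """Cyclically rotate the columns so that new column j is old column j + shift."""
    period = len(matrix[0])
    return [[row[(j + shift) % period] for j in range(period)] for row in matrix]

def cosine_similarity(m1, m2):
    """Stand-in similarity measure (an assumption, not the measure used in the paper)."""
    a = [x for row in m1 for x in row]
    b = [x for row in m2 for x in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_phase_shift(seq, split, period):
    """Compare the matrices on either side of `split` under every cyclic phase shift."""
    m_before = frequency_matrix(seq[:split], period, start=0)
    m_after = frequency_matrix(seq[split:], period, start=split)
    scores = {
        shift: cosine_similarity(m_before, rotate_columns(m_after, shift))
        for shift in range(period)
    }
    return max(scores, key=scores.get), scores

# Toy example: a perfect period-5 repeat with one inserted base in the middle.
unit = "AACGT"
toy_seq = unit * 10 + "G" + unit * 10
best, scores = best_phase_shift(toy_seq, split=50, period=5)
print(best, round(scores[0], 3), round(scores[best], 3))
```

In this toy case the two matrices are dissimilar without rotation and nearly identical once the second matrix is rotated by one column, which is the signature of a single-base insertion. Real genomic repeats are highly diverged, so a practical method also needs a statistical threshold (the abstract reports a type I error below 5%), which this sketch does not attempt.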
