Arndt Peter F, Hwa Terence
Max Planck Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany.
Bioinformatics. 2005 May 15;21(10):2322-8. doi: 10.1093/bioinformatics/bti376. Epub 2005 Mar 15.
Neighbor-dependent substitution processes generate specific patterns of dinucleotide frequencies in the genomes of most organisms. A prominent example in vertebrates is the CpG methylation-deamination process (the CpG effect). Such processes, often of unknown mechanistic origin, need to be incorporated into realistic models of nucleotide substitution.
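As an illustration of what a dinucleotide frequency pattern looks like in practice, the following minimal Python sketch computes observed/expected ratios for each dinucleotide in a sequence, where the expectation assumes the two neighboring bases are independent. The function name `dinucleotide_odds_ratios` and the independence-based expectation are assumptions made for this example; they are not taken from the paper, whose method is likelihood-based.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Return observed/expected frequency ratios for each dinucleotide in seq.

    Illustrative only: the expected frequency of a pair XY is taken to be
    f(X) * f(Y), i.e. the value under independence of neighboring bases.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two bases")

    # Single-base frequencies over the whole sequence.
    base_freq = {base: count / n for base, count in Counter(seq).items()}

    # Overlapping dinucleotide counts (positions i, i+1).
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1

    ratios = {}
    for pair, count in pair_counts.items():
        expected = base_freq[pair[0]] * base_freq[pair[1]]
        ratios[pair] = (count / total_pairs) / expected
    return ratios


if __name__ == "__main__":
    # A ratio well below 1 for "CG" would be the signature of CpG depletion.
    for pair, ratio in sorted(dinucleotide_odds_ratios("ACGTACGT").items()):
        print(pair, round(ratio, 3))
```

Dinucleotides that never occur are simply absent from the returned dictionary, and ambiguous characters such as N are not filtered out in this sketch.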
Based on a general framework of nucleotide substitutions, we developed a method that identifies the most relevant neighbor-dependent substitution processes, estimates their relative frequencies, and judges whether they are important enough to be included in the model. Starting from a model of neighbor-independent nucleotide substitution, we successively added neighbor-dependent substitution processes in the order of their ability to increase the likelihood of the model given the data. We present an analysis of neighbor-dependent nucleotide substitutions based on repetitive elements found in the genomes of human, zebrafish and fruit fly.
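The successive-addition procedure described above can be sketched as a greedy forward selection. The Python below is a minimal sketch under stated assumptions, not the authors' implementation: `forward_select`, the `log_likelihood` callable, the `min_gain` stopping threshold, and the toy numbers are all hypothetical. The abstract does not say how the importance of a process is judged, so a fixed minimum log-likelihood gain stands in for that criterion here.

```python
def forward_select(candidates, log_likelihood, min_gain=2.0):
    """Greedily add the candidate process that most increases the log-likelihood.

    candidates     -- labels of candidate neighbor-dependent processes
    log_likelihood -- callable mapping a list of included processes to the
                      maximized log-likelihood of the corresponding model
    min_gain       -- hypothetical stopping threshold (an assumption here)
    """
    selected = []
    remaining = list(candidates)
    current_ll = log_likelihood(selected)  # neighbor-independent baseline
    history = []

    while remaining:
        # Score every model that extends the current one by one process.
        scored = [(log_likelihood(selected + [c]), c) for c in remaining]
        best_ll, best = max(scored, key=lambda item: item[0])
        gain = best_ll - current_ll
        if gain < min_gain:
            break  # no remaining process improves the fit enough
        selected.append(best)
        remaining.remove(best)
        history.append((best, gain))
        current_ll = best_ll

    return selected, history


# Toy stand-in for a real likelihood: each process adds a fixed gain.
# The labels and numbers are made up purely for illustration.
TOY_GAINS = {"CpG>TpG": 120.0, "hypothetical_1": 8.5, "hypothetical_2": 0.3}


def toy_log_likelihood(processes):
    return -1000.0 + sum(TOY_GAINS[p] for p in processes)


if __name__ == "__main__":
    chosen, steps = forward_select(TOY_GAINS, toy_log_likelihood)
    for name, gain in steps:
        print(f"added {name}: log-likelihood gain {gain:.1f}")
```

In a real analysis the `log_likelihood` callable would refit the substitution model for each candidate set, so every step costs one fit per remaining candidate.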
A web server to perform the presented analysis is freely available at: http://evogen.molgen.mpg.de/server/substitution-analysis