Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidade de Vigo, 36310 Vigo, Spain.
BMC Genomics. 2013 Aug 1;14:528. doi: 10.1186/1471-2164-14-528.
Amplified fragment length polymorphism (AFLP) markers are frequently used for a wide range of studies, such as genome-wide mapping, population genetic diversity estimation, hybridization and introgression studies, phylogenetic analyses, and detection of signatures of selection. An important issue to be addressed for some of these fields is the distribution of the markers across the genome, particularly in relation to gene sequences.
Using in-silico restriction fragment analysis of the genomes of nine eukaryotic species, we characterise the distribution of AFLP fragments across the genome and, in particular, in relation to gene locations. First, we identify the physical position of markers across the chromosomes of all species. The accumulation of fragments observed around (peri)centromeric regions in some species is produced by repeated sequences, and it disappears when AFLP bands rather than fragments are considered. Second, we calculate the percentage of AFLP markers positioned within gene sequences. For the typical EcoRI/MseI enzyme pair, this ranges between 28% and 87% and is usually larger than expected by chance because of the higher GC content of gene sequences relative to intergenic ones. In agreement with this, using enzyme pairs with GC-rich restriction sites substantially increases these percentages. For example, with the enzyme system SacI/HpaII, 86% of AFLP markers are located within gene sequences in A. thaliana, and 100% of markers in Plasmodium falciparum. We further find that, for a typical trait controlled by 50 genes of average size, if 1000 AFLPs are used in a study, only about 1-2 of them would lie within 1 kb of any of the genes, and only about 50% of the genes would have markers within that distance.
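The kind of in-silico digestion described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`cut_positions`, `aflp_fragments`, `fraction_in_genes`), the toy sequence, and the gene interval are hypothetical, and the sketch assumes that an AFLP fragment is any stretch between two adjacent cut sites made by different enzymes (one EcoRI, one MseI). Selective nucleotides, fragment size windows, and the grouping of co-migrating fragments into bands are not modelled.

```python
import re

# Recognition sites and the offset of the cut within the site.
# EcoRI cuts G^AATTC and MseI cuts T^TAA; both sites are palindromic,
# so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "MseI": ("TTAA", 1),
}


def cut_positions(seq, enzyme):
    """Return 0-based cut coordinates of `enzyme` in `seq`."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern also finds overlapping occurrences of the site.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def aflp_fragments(seq, rare="EcoRI", frequent="MseI"):
    """Return (start, end) pairs for fragments flanked by one cut of each enzyme.

    Assumption of this sketch: only fragments with different enzymes at
    their two ends are amplified.
    """
    cuts = sorted(
        [(p, rare) for p in cut_positions(seq, rare)]
        + [(p, frequent) for p in cut_positions(seq, frequent)]
    )
    fragments = []
    for (start, enz_a), (end, enz_b) in zip(cuts, cuts[1:]):
        if enz_a != enz_b:
            fragments.append((start, end))
    return fragments


def fraction_in_genes(fragments, genes):
    """Fraction of fragments overlapping any gene interval (half-open coordinates)."""
    if not fragments:
        return 0.0
    hits = sum(
        any(start < g_end and g_start < end for g_start, g_end in genes)
        for start, end in fragments
    )
    return hits / len(fragments)


# Toy example with a made-up sequence and one made-up gene interval.
toy_seq = "CCGAATTCAAAATTAACCCCTTAAGGGAATTCGG"
toy_genes = [(0, 10)]
toy_fragments = aflp_fragments(toy_seq)
print(toy_fragments)
print(fraction_in_genes(toy_fragments, toy_genes))
```

Swapping in another enzyme pair only requires adding its recognition site and cut offset to `ENZYMES`; real genome analyses would read chromosome sequences and gene annotations from files rather than from literals.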
The high coverage of AFLP markers across the genomes and the high proportion of markers within or close to gene sequences make them suitable for genome scans and for detecting large islands of differentiation in the genome. However, for specific traits, the percentage of AFLP markers close to the relevant genes can be rather small. Genome scans aimed at finding markers closely linked to selected loci can therefore be difficult in many instances.