Luo Z W, Potokina E, Druka A, Wise R, Waugh R, Kearsey M J
School of Biosciences, University of Birmingham, Birmingham B15 2TT, United Kingdom.
Genetics. 2007 Jun;176(2):789-800. doi: 10.1534/genetics.106.067843. Epub 2007 Apr 3.
The recent development of Affymetrix chips designed from assembled EST sequences has spawned considerable interest in identifying single-feature polymorphisms (SFPs) from transcriptome data. SFPs are valuable genetic markers that potentially offer a physical link to the structural genes themselves. However, most current SFP prediction methodologies were developed for sequenced species, although SFPs are particularly valuable for species with complex and unsequenced genomes. To establish the sensitivity and specificity of prediction, we explored four methods for identifying SFPs from experiments involving two tissues in two commercial barleys and their doubled-haploid progeny. The methods were compared in terms of numbers of SFPs predicted and their ability to identify known sequence polymorphisms in the features, to confirm existing SNP genotypes and to match existing maps and individual haplotypes. We identified >4000 separate SFPs that accurately predicted the SNP genotype of >98% of the doubled-haploid (DH) lines. They were highly enriched for features containing sequence polymorphisms but all methods uniformly identified a majority of SFPs (approximately 64%) in features for which there was no sequence polymorphism while 5% mapped to different locations, indicating that "SFPs" mainly represent polymorphism in cis-acting regulators. All methods are efficient and robust at predicting markers for gene mapping.
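The comparison of SFP-predicted genotypes against existing SNP genotypes can be illustrated with a minimal sketch. This is not the authors' implementation. The function name, the "A"/"B" parental allele labels, and the toy DH line identifiers are all assumptions made for illustration; the abstract does not specify how concordance was computed.

```python
def genotype_concordance(sfp_calls, snp_calls):
    """Return the fraction of DH lines whose SFP-predicted parental allele
    matches the SNP genotype. Lines with a missing call (None) in either
    data set are skipped. All names here are hypothetical."""
    compared = 0
    matched = 0
    for line_id, sfp_allele in sfp_calls.items():
        snp_allele = snp_calls.get(line_id)
        if sfp_allele is None or snp_allele is None:
            continue  # no comparison possible for this line
        compared += 1
        if sfp_allele == snp_allele:
            matched += 1
    if compared == 0:
        raise ValueError("no DH lines with both an SFP call and a SNP call")
    return matched / compared


# Toy data (invented): "A" and "B" stand for the two parental alleles.
sfp = {"DH1": "A", "DH2": "B", "DH3": "A", "DH4": None}
snp = {"DH1": "A", "DH2": "B", "DH3": "B", "DH4": "A"}
concordance = genotype_concordance(sfp, snp)
```

In this toy case, three lines are comparable and two agree, so the concordance is 2/3. Under this reading, the abstract's figure would correspond to a value above 0.98 over the DH lines.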