George Andrew W
Mathematical and Information Sciences, CSIRO, Brisbane, QLD 4067, Australia.
Theor Appl Genet. 2009 Aug;119(3):483-96. doi: 10.1007/s00122-009-1054-x. Epub 2009 May 18.
Genetic studies in polyploid plants rely heavily on data collected from dominant marker loci. A dominant marker locus is a locus for which only the presence or absence of an observable (dominant) allele is recorded. Before these marker loci can be used for genetic exploration, the number of copies of the dominant allele carried by a parent (the copy number) must be determined for each marker locus. Copy number in polyploids is estimated using a hypothesis testing procedure, but the performance of this estimation procedure has never been evaluated. In this paper, I quantify whether the highly sought-after single-copy markers can be accurately identified, whether the performance of the estimation procedure improves with increasing sample size, and whether the procedure can accurately estimate the copy number of high-copy markers. I found that the probability of incorrectly estimating copy number is quite low, and that more data can actually reduce the accuracy of the estimation procedure when the assumptions of the test are violated. Fortunately, when a significant result is obtained, it is almost always correct. The challenge is often in obtaining a significant result.
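The abstract does not spell out the test itself, so the following is only an illustrative sketch of how a copy-number hypothesis might be checked, not the procedure evaluated in the paper. It assumes polysomic inheritance with random chromosome segregation and no double reduction, a cross to a parent carrying no copies of the dominant allele, and an exact two-sided binomial test. The function names `presence_probability` and `exact_two_sided_p` are hypothetical.

```python
from math import comb


def presence_probability(ploidy: int, copies: int) -> float:
    """Expected fraction of offspring showing the marker.

    Assumption: the parent carries `copies` copies of the dominant allele,
    the other parent carries none, and each gamete receives a random half
    of the parent's chromosomes (no double reduction).
    """
    half = ploidy // 2
    # A gamete lacks the allele only if all of its chromosomes are drawn
    # from the (ploidy - copies) chromosomes that do not carry it.
    return 1.0 - comb(ploidy - copies, half) / comb(ploidy, half)


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def exact_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    observed = binomial_pmf(k, n, p)
    return sum(
        pm
        for i in range(n + 1)
        if (pm := binomial_pmf(i, n, p)) <= observed * (1 + 1e-9)
    )


# Hypothetical data: 52 of 100 offspring of a tetraploid parent show the marker.
n_offspring, n_present = 100, 52
for copies in (1, 2):
    p = presence_probability(4, copies)
    p_value = exact_two_sided_p(n_present, n_offspring, p)
    print(f"copies={copies}: expected presence {p:.3f}, p-value {p_value:.3g}")
```

Under these assumptions a single-copy marker is expected in half of the offspring, while a two-copy marker in a tetraploid is expected in five sixths of them, so a copy number that the data fail to reject is a candidate estimate. The expected proportions for higher copy numbers approach 1 and lie closer together, which is one way to see why high-copy markers are harder to separate at a given sample size.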