Presson Angela P, Sobel Eric M, Pajukanta Paivi, Plaisier Christopher, Weeks Daniel E, Aberg Karolina, Papp Jeanette C
Department of Human Genetics, University of California, Los Angeles, CA 90095, USA.
BMC Bioinformatics. 2008 Jul 21;9:317. doi: 10.1186/1471-2105-9-317.
Correctly merging data sets that have been independently genotyped can increase statistical power in linkage and association studies. However, alleles from microsatellite data sets genotyped with different experimental protocols or platforms cannot be accurately matched using base-pair size information alone. In a previous publication we introduced a statistical model for merging microsatellite data by matching allele frequencies between data sets. These methods are implemented in our software MicroMerge version 1 (v1). While MicroMerge v1 output can be analyzed by some genetic analysis programs, many programs cannot analyze alignments that do not match alleles one-to-one between data sets. A consequence of such alignments is that codominant genotypes must often be analyzed as phenotypes. In this paper we describe several extensions that are implemented in MicroMerge version 2 (v2).
Notably, MicroMerge v2 includes a new one-to-one alignment option that creates merged pedigree and locus files that can be handled by most genetic analysis software. Other features in MicroMerge v2 give users greater control over four aspects of merging: 1) optimizing the algorithm for different merging scenarios, such as data sets with very different sample sizes or multiple data sets, 2) merging small data sets when a reliable set of allele frequencies is available, 3) improving the quantity of merged data, and 4) improving the quality of merged data. We present results from simulated and real microsatellite genotype data sets, and conclude with an association analysis of three familial dyslipidemia (FD) study samples genotyped at different laboratories. Independent analysis of each FD data set did not yield consistent results, but analysis of the merged data sets identified strong association at locus D11S2002.
The MicroMerge v2 features will enable merging for a variety of genotype data sets, which in turn will facilitate meta-analyses that increase the power of association analyses.
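The idea of aligning alleles one-to-one by matching allele frequencies can be illustrated with a deliberately simplified sketch. This is not the MicroMerge statistical model: it assumes, purely for illustration, that the two data sets differ only by a single constant base-pair offset, and it picks the offset that minimizes the sum of squared frequency differences. The function names `best_shift` and `one_to_one_alignment`, the `max_shift` parameter, and the example frequencies are all hypothetical.

```python
from typing import Dict, Tuple


def best_shift(freq_a: Dict[int, float],
               freq_b: Dict[int, float],
               max_shift: int = 6) -> Tuple[int, float]:
    """Find the base-pair offset for data set B that best matches data set A.

    Assumption (illustrative only): B's allele sizes differ from A's by one
    constant integer offset. The score is the sum of squared differences
    between allele frequencies after shifting B.
    """
    best = (0, float("inf"))
    for shift in range(-max_shift, max_shift + 1):
        shifted = {size + shift: f for size, f in freq_b.items()}
        sizes = set(freq_a) | set(shifted)
        # Alleles missing from one data set count as frequency 0.
        cost = sum((freq_a.get(s, 0.0) - shifted.get(s, 0.0)) ** 2
                   for s in sizes)
        if cost < best[1]:
            best = (shift, cost)
    return best


def one_to_one_alignment(freq_a: Dict[int, float],
                         freq_b: Dict[int, float],
                         max_shift: int = 6) -> Dict[int, int]:
    """Map each allele size in B to exactly one allele label on A's scale."""
    shift, _ = best_shift(freq_a, freq_b, max_shift)
    # A constant shift is injective, so the mapping is one-to-one.
    return {size_b: size_b + shift for size_b in sorted(freq_b)}


if __name__ == "__main__":
    # Hypothetical allele frequencies at one locus from two laboratories.
    lab_a = {100: 0.10, 102: 0.50, 104: 0.30, 106: 0.10}
    lab_b = {101: 0.12, 103: 0.48, 105: 0.29, 107: 0.11}
    print(one_to_one_alignment(lab_a, lab_b))
```

Because every allele in the second data set maps to exactly one label on the first data set's scale, the merged genotypes remain codominant and can be written to ordinary pedigree and locus files. A realistic merge would also need to handle non-constant offsets and sampling noise, which this sketch ignores.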