Jacquin Laval, Cao Tuong-Vi, Grenier Cécile, Ahmadi Nourollah
CIRAD, UMR AGAP, Centre de Coopération Internationale en Recherche Agronomique pour le Développement, Avenue Agropolis, Montpellier Cedex 5, 34398, France.
BMC Bioinformatics. 2015 Dec 3;16:404. doi: 10.1186/s12859-015-0830-7.
Numerous simulation tools based on specific assumptions have been proposed to simulate populations. Here we present a simulation tool named DHOEM (densification of haplotypes by loess regression and maximum likelihood), which is free from population assumptions and simulates new markers in real SNP marker data. The main objective of DHOEM is to generate, by statistical learning from an initial population, a new population that incorporates both real and simulated SNPs and matches the realized features of the initial population.
To demonstrate DHOEM's abilities, we used a sample of 704 haplotypes for 12 chromosomes with 8336 SNPs from a synthetic population used for breeding upland rice in Latin America. The distributions of allele frequencies, pairwise SNP linkage disequilibrium (LD) coefficients and data structures, before and after marker densification of the associated marker data set, were shown to be in relatively good agreement at moderate degrees of marker densification. DHOEM is a user-friendly tool that allows the user to specify the desired level of marker density and a user-defined minor allele frequency (MAF) limit, and it produces the densified data set in a reasonable computation time.
DHOEM is a user-friendly and useful tool for simulation and methodological studies in quantitative genetics and breeding.
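The summary statistics compared before and after densification (allele frequencies, a MAF limit, and pairwise LD) can be illustrated with a minimal Python sketch. This is not DHOEM's code or interface: the function names, the 0/1 allele coding, the choice of r² as the LD coefficient, and the toy haplotype matrix are all assumptions made for illustration only.

```python
# Illustrative sketch only; not part of DHOEM.
# Assumes phased, biallelic haplotypes coded 0/1 (rows = haplotypes, columns = SNPs).

def minor_allele_frequency(column):
    """Return the minor allele frequency of one 0/1-coded SNP."""
    p = sum(column) / len(column)
    return min(p, 1 - p)

def r_squared(col_a, col_b):
    """Return the pairwise LD coefficient r^2 between two 0/1-coded SNPs."""
    n = len(col_a)
    p_a = sum(col_a) / n
    p_b = sum(col_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(a & b for a, b in zip(col_a, col_b)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either SNP is monomorphic.
    return d * d / denom if denom > 0 else float("nan")

def filter_by_maf(haplotypes, maf_limit):
    """Return the indices of SNPs whose MAF is at least maf_limit."""
    columns = list(zip(*haplotypes))
    return [j for j, col in enumerate(columns)
            if minor_allele_frequency(col) >= maf_limit]

# Hypothetical toy data: 6 haplotypes, 4 SNPs.
haplotypes = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

columns = list(zip(*haplotypes))
kept = filter_by_maf(haplotypes, maf_limit=0.2)
ld_first_pair = r_squared(columns[0], columns[1])
```

Computing these quantities on the initial and on the densified marker sets, then comparing their distributions, is one way to carry out the kind of agreement check described in the abstract.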