Arca Mariangela, Mary-Huard Tristan, Gouesnard Brigitte, Bérard Aurélie, Bauland Cyril, Combes Valérie, Madur Delphine, Charcosset Alain, Nicolas Stéphane D
Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, Gif-sur-Yvette, France.
AGAP, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France.
Front Plant Sci. 2021 Jan 7;11:568699. doi: 10.3389/fpls.2020.568699. eCollection 2020.
Genebanks harbor original landraces carrying many original favorable alleles for mitigating biotic and abiotic stresses. Their genetic diversity remains poorly characterized, however, because diversity within each landrace is large. We developed a high-throughput, cheap and labor-saving DNA bulk approach based on a single-nucleotide polymorphism (SNP) Illumina Infinium HD array to genotype landraces. For each landrace, a sample was assembled by mixing equal weights of young leaves, and DNA was extracted from this mixture. We then estimated allelic frequencies in each DNA bulk from the fluorescent intensity ratio (FIR) between the two alleles at each SNP, using a two-step approach. We first tested whether the DNA bulk was monomorphic or polymorphic by comparing its FIR with the two FIR distributions of individuals homozygous for allele A or allele B, respectively. If the DNA bulk was polymorphic, we estimated its allelic frequency with a predictive equation calibrated on FIR from DNA bulks with known allelic frequencies. Our approach (i) gives accurate allelic frequency estimates that are highly reproducible across laboratories and (ii) protects against false detection of allele fixation within landraces. We estimated allelic frequencies of 23,412 SNPs in 156 landraces representing American and European maize diversity. Modified Rogers' genetic distances between the 156 landraces, estimated from the 23,412 SNPs and from 17 simple sequence repeats using the same DNA bulks, were highly correlated, suggesting that ascertainment bias is low. Our approach is affordable, easy to implement and requires neither specific bioinformatics support nor specific laboratory equipment, and should therefore be highly relevant for large-scale characterization of genebanks for a wide range of species.
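The two-step estimation and the distance computation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: FIR is taken as A / (A + B); the monomorphism test is a simple cutoff of `z_max` standard deviations around the mean FIR of each homozygous class; and the predictive equation is replaced by an ordinary least-squares line, since the abstract does not specify the form of the calibrated equation. All function and parameter names are hypothetical. The distance function uses the standard definition of the modified Rogers' distance, which for biallelic SNPs reduces to the square root of the mean squared difference in allele A frequency.

```python
import math
import statistics


def fir(intensity_a, intensity_b):
    """Fluorescent intensity ratio, assumed here to be A / (A + B)."""
    return intensity_a / (intensity_a + intensity_b)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_frequency(bulk_fir, hom_a_firs, hom_b_firs, calibration, z_max=3.0):
    """Estimate the frequency of allele A in one DNA bulk at one SNP.

    hom_a_firs / hom_b_firs: FIR values of individuals homozygous AA / BB.
    calibration: (FIR, known frequency of A) pairs from control bulks.
    z_max: hypothetical cutoff, in standard deviations, for step 1.
    """
    # Step 1: call the bulk monomorphic if its FIR is compatible with the
    # FIR distribution of one homozygous class (illustrative rule only).
    for firs, fixed_freq in ((hom_a_firs, 1.0), (hom_b_firs, 0.0)):
        mean, sd = statistics.fmean(firs), statistics.stdev(firs)
        if abs(bulk_fir - mean) <= z_max * sd:
            return fixed_freq
    # Step 2: the bulk is polymorphic, so predict its frequency from a
    # curve calibrated on bulks of known frequency (a straight line here,
    # standing in for the unspecified predictive equation).
    intercept, slope = fit_line([f for f, _ in calibration],
                                [p for _, p in calibration])
    return min(1.0, max(0.0, intercept + slope * bulk_fir))


def modified_rogers_distance(freqs_1, freqs_2):
    """Modified Rogers' distance between two landraces over biallelic SNPs.

    Each argument lists the frequency of allele A at the same SNPs.
    """
    squared = sum((p - q) ** 2 for p, q in zip(freqs_1, freqs_2))
    return math.sqrt(squared / len(freqs_1))


# Made-up values for a single SNP, for illustration only.
hom_a = [0.95, 0.97, 0.96, 0.98, 0.94]
hom_b = [0.03, 0.05, 0.04, 0.02, 0.06]
controls = [(0.26, 0.2), (0.42, 0.4), (0.58, 0.6), (0.74, 0.8)]
bulk_estimate = estimate_frequency(fir(500.0, 500.0), hom_a, hom_b, controls)
```

Running the monomorphism test before the regression is what guards against reporting a small nonzero frequency for a fixed allele, or a fixed allele for a rare one; in practice the cutoff and the calibration would be set per SNP from real control data.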