Division of Livestock Sciences (NUWI), University of Natural Resources and Life Sciences, Gregor-Mendel Strasse 33, 1180, Vienna, Austria.
Lilongwe University of Agriculture and Natural Resources, P. O. Box 219, Lilongwe, Malawi.
Genet Sel Evol. 2018 Aug 22;50(1):43. doi: 10.1186/s12711-018-0414-x.
Runs of homozygosity (ROH) islands are stretches of homozygous sequence in the genome of a large proportion of individuals in a population. Algorithms for the detection of ROH depend on the similarity of haplotypes. Coverage gaps and copy number variants (CNV) may result in incorrect identification of such similarity, leading to the detection of ROH islands where none exists. Misidentified hemizygous regions will also appear as homozygous based on sequence variation alone. Our aim was to identify ROH islands influenced by marker coverage gaps or CNV, using Illumina BovineHD BeadChip (777 K) single nucleotide polymorphism (SNP) data for Austrian Brown Swiss, Tyrol Grey and Pinzgauer cattle.
ROH were detected using clustering, and ROH islands were determined from population inbreeding levels for each marker. CNV were detected using a multivariate copy number analysis method and a hidden Markov model. SNP coverage gaps were defined as genomic regions with intermarker distances on average longer than 9.24 kb. ROH islands that overlapped CNV regions (CNVR) or SNP coverage gaps were considered as potential artefacts. Permutation tests were used to determine if overlaps between CNVR with copy losses and ROH islands were due to chance. Diversity of the haplotypes in the ROH islands was assessed by haplotype analyses.
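As a minimal sketch of two of the steps described above, the Python below flags SNP coverage gaps and runs a simple permutation test on the overlap between ROH islands and CNVR. It is an illustration under stated assumptions, not the authors' implementation. The function names, the sliding window of consecutive markers, the single-chromosome layout, the half-open `(start, end)` intervals and the uniform random relocation of islands are all hypothetical choices made for the example. Only the 9.24 kb threshold comes from the text.

```python
import random


def coverage_gaps(positions, window=5, threshold_bp=9240):
    """Flag spans whose mean intermarker distance exceeds threshold_bp.

    positions: sorted marker coordinates (bp) on one chromosome.
    window: number of consecutive intermarker intervals averaged
            (an assumption; the abstract does not state the window).
    Returns unmerged (start, end) spans.
    """
    gaps = []
    for i in range(len(positions) - window):
        span = positions[i + window] - positions[i]
        if span / window > threshold_bp:
            gaps.append((positions[i], positions[i + window]))
    return gaps


def overlap_bp(a, b):
    """Base pairs shared by two lists of half-open (start, end) intervals.

    Assumes intervals within each list do not overlap one another.
    """
    total = 0
    for s1, e1 in a:
        for s2, e2 in b:
            total += max(0, min(e1, e2) - max(s1, s2))
    return total


def permutation_p_value(islands, cnvr, chrom_len, n_perm=1000, seed=0):
    """One-sided permutation test for island/CNVR overlap.

    Each island keeps its length but is moved to a uniformly random
    position on the chromosome. Relocated islands may overlap each
    other, which is a simplification.
    """
    rng = random.Random(seed)
    observed = overlap_bp(islands, cnvr)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for s, e in islands:
            length = e - s
            new_start = rng.randint(0, chrom_len - length)
            shuffled.append((new_start, new_start + length))
        if overlap_bp(shuffled, cnvr) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A small p-value from `permutation_p_value` would indicate that the observed overlap is larger than expected for islands placed at random, which is the kind of evidence the permutation tests in the study are meant to provide.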
In Brown Swiss, Tyrol Grey and Pinzgauer, we identified 13, 22, and 24 ROH islands covering 26.6, 389.0 and 35.8 Mb, respectively, and we detected 30, 50 and 71 CNVR derived from CNV by using both algorithms, respectively. Overlaps of ROH islands with CNVR or coverage gaps occurred for 7, 14 and 16 ROH islands, respectively. About 37, 44 and 52% of the ROH island coverage in Brown Swiss, Tyrol Grey and Pinzgauer, respectively, was affected by copy loss. Intersections between ROH islands and CNVR were small, but significantly larger than those obtained for ROH islands placed at random locations across the genome, implying an association between ROH islands and CNVR. Haplotype diversity for reliable ROH islands was lower than for ROH islands that intersected with copy loss CNVR.
Our findings show that a significant proportion of the ROH islands in the bovine genome are artefacts due to CNV or SNP coverage gaps.