Integration of Infinium and Axiom SNP array data in the outcrossing species Malus × domestica and causes for seemingly incompatible calls.

Authors

Howard Nicholas P, Troggio Michela, Durel Charles-Eric, Muranty Hélène, Denancé Caroline, Bianco Luca, Tillman John, van de Weg Eric

Affiliations

Institut für Biologie und Umweltwissenschaften, Carl von Ossietzky Univ., Oldenburg, Germany.

Department of Horticultural Science, Univ. of Minnesota, St Paul, USA.

Publication information

BMC Genomics. 2021 Apr 7;22(1):246. doi: 10.1186/s12864-021-07565-7.

Abstract

BACKGROUND

Single nucleotide polymorphism (SNP) array technology has been increasingly used to generate large quantities of SNP data for use in genetic studies. As new arrays are developed to take advantage of new technology and of improved probe design using new genome sequence and panel data, a need to integrate data from different arrays and array platforms has arisen. This study was undertaken in view of our need for an integrated high-quality dataset of Illumina Infinium® 20 K and Affymetrix Axiom® 480 K SNP array data in apple (Malus × domestica). In this study, we qualify and quantify the compatibility of SNP calling, defined as SNP calls that are both accurate and concordant, across both arrays by two approaches. First, the concordance of SNP calls was evaluated using a set of 417 duplicate individuals genotyped on both arrays starting from a set of 10,295 robust SNPs on the Infinium array. Next, the accuracy of the SNP calls was evaluated on additional germplasm (n = 3141) from both arrays using Mendelian inconsistent and consistent errors across thousands of pedigree links. While performing this work, we took the opportunity to evaluate reasons for probe failure and observed discordant SNP calls.
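As a rough illustration of the pedigree-based accuracy check described above, the following minimal sketch flags Mendelian-inconsistent calls in parent-parent-offspring trios at a biallelic SNP. The genotype encoding (two-letter strings such as "AB", with None for a missing call) and the function names are assumptions made for this example and are not taken from the study. The sketch covers only the Mendelian-inconsistent case.

```python
from itertools import product


def is_mendelian_consistent(parent1, parent2, child):
    """Return True if the child genotype can be formed from one allele of each parent.

    Genotypes are two-character strings such as "AA", "AB" or "BB"
    (an encoding assumed for this example). Allele order is ignored.
    """
    possible = {"".join(sorted(a + b)) for a, b in product(parent1, parent2)}
    return "".join(sorted(child)) in possible


def count_mendelian_inconsistencies(trios):
    """Count inconsistent trios, skipping any trio that contains a missing call (None)."""
    errors = 0
    for parent1, parent2, child in trios:
        if None in (parent1, parent2, child):
            continue  # a missing call carries no information about consistency
        if not is_mendelian_consistent(parent1, parent2, child):
            errors += 1
    return errors


# Hypothetical calls for one SNP across four trios: (parent 1, parent 2, offspring)
example_trios = [
    ("AA", "BB", "AB"),  # consistent
    ("AA", "AA", "AB"),  # inconsistent: neither parent carries B
    ("AB", "AB", "BB"),  # consistent
    ("AA", None, "BB"),  # skipped: one parent has no call
]
n_errors = count_mendelian_inconsistencies(example_trios)
```

An inconsistency at a trio only shows that at least one of the three calls (or the recorded pedigree link) is wrong; it does not by itself say which one.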

RESULTS

Concordance among the duplicate individuals averaged 97.1% across the 10,295 SNPs. Of these SNPs, 35% had one or more discordant calls and were curated further, leading to a final set of 8412 (81.7%) SNPs that were deemed compatible. Compatibility was strongly influenced by the presence of alternate probe binding locations and of secondary polymorphisms. The impact of secondary polymorphisms depended strongly on their number and their proximity to the 3' end of the probe.
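To make the concordance figure concrete, here is a minimal sketch of how per-SNP concordance between duplicate samples, and its average across SNPs, could be computed. The data layout (a dict from SNP name to a list of calls, one per duplicate individual in the same order on both arrays), the None marker for missing calls, and the function names are all assumptions for illustration. The sketch also assumes that both arrays report alleles with the same coding and strand orientation.

```python
def snp_concordance(calls_a, calls_b):
    """Fraction of duplicate individuals with identical calls on both arrays.

    Pairs in which either call is missing (None) are excluded.
    Returns None when no pair can be compared.
    """
    compared = [(a, b) for a, b in zip(calls_a, calls_b)
                if a is not None and b is not None]
    if not compared:
        return None
    return sum(a == b for a, b in compared) / len(compared)


def mean_concordance(array_a, array_b):
    """Average the per-SNP concordance over the SNPs present on both arrays."""
    values = []
    for snp in array_a.keys() & array_b.keys():
        value = snp_concordance(array_a[snp], array_b[snp])
        if value is not None:
            values.append(value)
    return sum(values) / len(values) if values else None


# Hypothetical calls for two SNPs on the same duplicate individuals
infinium_calls = {"snp1": ["AA", "AB", "BB", None], "snp2": ["AA", "AA"]}
axiom_calls = {"snp1": ["AA", "AB", "AB", "BB"], "snp2": ["AA", "AA"]}

per_snp = {snp: snp_concordance(infinium_calls[snp], axiom_calls[snp])
           for snp in infinium_calls}
overall = mean_concordance(infinium_calls, axiom_calls)

# Share of SNPs retained after curation, using the counts reported above
retained_fraction = 8412 / 10295
```

Averaging per SNP, as done here, weights every SNP equally. Pooling all call pairs before dividing would instead weight SNPs by their number of non-missing pairs, and the abstract does not say which convention was used.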

CONCLUSIONS

The Infinium and Axiom SNP array data were mostly compatible. However, data integration required intense data filtering and curation. This work resulted in a workflow and information that may be of use in other data integration efforts. Such an in-depth analysis of array concordance and accuracy as ours has not been previously described in the literature and will be useful in future work on SNP array data integration and interpretation, and in probe/platform development.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83eb/8028180/9e7752feab55/12864_2021_7565_Fig1_HTML.jpg
