Nicolazzi E L, Biffani S, Biscarini F, Orozco Ter Wengel P, Caprera A, Nazzicari N, Stella A
Fondazione Parco Tecnologico Padano (PTP), Via Einstein, Cascina Codazza, Lodi, 26900, Italy.
Istituto di biologia e biotecnologia Agraria (IBBA-CNR), Consiglio Nazionale delle Ricerche, Via Einstein, Cascina Codazza, Lodi, 26900, Italy.
Anim Genet. 2015 Aug;46(4):343-53. doi: 10.1111/age.12295. Epub 2015 Apr 23.
Since the beginning of the genomic era, the number of available single nucleotide polymorphism (SNP) arrays has grown considerably. In the bovine species alone, 11 SNP chips not completely covered by intellectual property are currently available, and the number is growing. Genomic/genotype data are not standardized, which hampers their exchange and integration. In addition, the software used to analyse these data usually requires non-standard (i.e. case-specific) input files; given the large amount of data to be handled, producing these files requires at least some programming skills. In this work, we describe a software toolkit for SNP array data management, imputation, genome-wide association studies, population genetics and genomic selection. However, this toolkit does not solve the critical need for standardization of genotypic data and software input files. It only highlights the chaotic situation each researcher faces on a daily basis and offers some practical advice on the currently available tools for navigating the complexity of SNP array data.