Díaz de Ståhl Teresita, Sandgren Johanna, Piotrowski Arkadiusz, Nord Helena, Andersson Robin, Menzel Uwe, Bogdan Adam, Thuresson Ann-Charlotte, Poplawski Andrzej, von Tell Desiree, Hansson Caisa M, Elshafie Amir I, Elghazali Gehad, Imreh Stephan, Nordenskjöld Magnus, Upadhyaya Meena, Komorowski Jan, Bruder Carl E G, Dumanski Jan P
Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, Sweden.
Hum Mutat. 2008 Mar;29(3):398-408. doi: 10.1002/humu.20659.
To further explore the extent of large-scale structural variation in the human genome, we assessed copy number variations (CNVs) in a series of 71 healthy subjects from three ethnic groups. CNVs were analyzed using comparative genomic hybridization (CGH) to a BAC array covering the human genome, using DNA extracted from peripheral blood, thus avoiding any culture-induced rearrangements. By applying a newly developed computational algorithm based on hidden Markov modeling, we identified 1,078 autosomal CNVs, each spanning at least two neighboring/overlapping BACs, which represent 315 distinct regions. These polymorphisms averaged approximately 350 kb in size and together involved approximately 117 Mb, or about 3.5% of the genome. Gains were about four times more common than deletions, and segmental duplications (SDs) were overrepresented, especially in larger deletion variants. This strengthens the notion that SDs often define hotspots of chromosomal rearrangements. Over 60% of the identified autosomal rearrangements match previously reported CNVs detected on various platforms. However, results from chromosome X do not agree well with the previously annotated CNVs. Furthermore, data from single BACs deviating in copy number suggest that our estimate of total variation given above is conservative. This report contributes to the establishment of a common baseline for CNV, which is an important resource in human genetics.
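The abstract does not describe the algorithm itself, so the following is only a minimal sketch of how hidden Markov segmentation of array-CGH data can work in general, not the authors' implementation. It assumes three copy-number states with Gaussian emissions over per-clone log2 ratios. The state means, noise level, and transition probability (`MEANS`, `SIGMA`, `STAY`) are illustrative values chosen for this example, and `viterbi` and `call_cnvs` are hypothetical helper names. The one element taken from the abstract is the requirement that a reported CNV cover at least two neighboring clones.

```python
import math

# All parameters below are assumptions for illustration, not values from the paper.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.4}  # assumed mean log2 ratio per state
SIGMA = 0.15  # assumed standard deviation of measurement noise
STAY = 0.9    # assumed probability of remaining in the same state


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi(ratios):
    """Return the most likely state sequence for clone-ordered log2 ratios."""
    if not ratios:
        return []
    log_stay = math.log(STAY)
    log_move = math.log((1 - STAY) / (len(STATES) - 1))
    log_init = -math.log(len(STATES))  # uniform prior over states

    score = {s: log_init + log_gauss(ratios[0], MEANS[s], SIGMA) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for x in ratios[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            def via(p, s=s):
                return score[p] + (log_stay if p == s else log_move)
            best_prev = max(STATES, key=via)
            pointers[s] = best_prev
            new_score[s] = via(best_prev) + log_gauss(x, MEANS[s], SIGMA)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def call_cnvs(path, min_clones=2):
    """Collapse runs of non-neutral states into (first, last, state) calls,
    keeping only runs that cover at least min_clones consecutive clones."""
    calls = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            if path[start] != "neutral" and i - start >= min_clones:
                calls.append((start, i - 1, path[start]))
            start = i
    return calls


if __name__ == "__main__":
    # Synthetic log2 ratios for 11 consecutive clones (made-up data).
    ratios = [0.02, -0.03, 0.41, 0.38, 0.45, 0.01, -0.02, -0.52, -0.48, 0.03, 0.0]
    print(call_cnvs(viterbi(ratios)))
```

In this sketch, the self-transition probability controls how readily the model opens a new segment, so an isolated deviating clone tends to be absorbed into the neutral state. That is consistent with the abstract's remark that single deviating BACs were left out of the total, making the estimate conservative.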