Beroukhim Rameen, Lin Ming, Park Yuhyun, Hao Ke, Zhao Xiaojun, Garraway Levi A, Fox Edward A, Hochberg Ephraim P, Mellinghoff Ingo K, Hofer Matthias D, Descazeaud Aurelien, Rubin Mark A, Meyerson Matthew, Wong Wing Hung, Sellers William R, Li Cheng
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA.
PLoS Comput Biol. 2006 May;2(5):e41. doi: 10.1371/journal.pcbi.0020041. Epub 2006 May 12.
Loss of heterozygosity (LOH) of chromosomal regions bearing tumor suppressors is a key event in the evolution of epithelial and mesenchymal tumors. Identification of these regions usually relies on genotyping tumor and counterpart normal DNA and noting regions where heterozygous alleles in the normal DNA become homozygous in the tumor. However, paired normal samples for tumors and cell lines are often not available. With the advent of oligonucleotide arrays that simultaneously assay thousands of single-nucleotide polymorphism (SNP) markers, genotyping can now be done at high enough resolution to allow identification of LOH events by the absence of heterozygous loci, without comparison to normal controls. Here we describe a hidden Markov model-based method to identify LOH from unpaired tumor samples, taking into account SNP intermarker distances, SNP-specific heterozygosity rates, and the haplotype structure of the human genome. When we applied the method to data genotyped on 100K arrays, we correctly identified 99% of SNP markers as either retention or loss. We also correctly identified 81% of the regions of LOH, including 98% of regions greater than 3 megabases. By integrating copy number analysis into the method, we were able to distinguish LOH from allelic imbalance. Application of this method to data from a set of prostate samples without paired normals identified known regions of prevalent LOH. We have developed a method for analyzing high-density oligonucleotide SNP array data to accurately identify regions of LOH and retention in tumors without the need for paired normal samples.
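The hidden Markov model idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two hidden states (retention and LOH), a hypothetical exponential distance-decay form for the transition probabilities, and an assumed genotyping error rate, and it ignores haplotype structure and copy number. The function name `viterbi_loh` and all parameter defaults are illustrative choices, not values from the paper.

```python
import math

RET, LOH = 0, 1  # hidden states: retention, loss of heterozygosity


def viterbi_loh(het_calls, positions, het_rates,
                prior_loh=0.1, error_rate=0.01, scale_bp=1e8):
    """Most likely RET/LOH state path for one chromosome of an unpaired tumor.

    het_calls : list of bool, True if the SNP was called heterozygous
    positions : list of int, SNP coordinates in base pairs, ascending
    het_rates : list of float, population heterozygosity rate of each SNP
    prior_loh, error_rate, scale_bp : assumed, illustrative parameters
    """
    n = len(het_calls)
    if n == 0:
        return []
    prior = [1.0 - prior_loh, prior_loh]

    def log_emit(state, i):
        # Under retention a SNP is heterozygous at its population rate;
        # under LOH a heterozygous call can only be a genotyping error.
        p_het = het_rates[i] if state == RET else error_rate
        p_het = min(max(p_het, 1e-6), 1.0 - 1e-6)  # avoid log(0)
        return math.log(p_het if het_calls[i] else 1.0 - p_het)

    def log_trans(src, dst, dist_bp):
        # Assumed form: the closer two SNPs are, the less likely the
        # state changes between them.
        theta = 1.0 - math.exp(-dist_bp / scale_bp)
        p = theta * prior[dst] + ((1.0 - theta) if src == dst else 0.0)
        return math.log(max(p, 1e-300))

    # Forward pass in log space, keeping back-pointers.
    score = [math.log(prior[s]) + log_emit(s, 0) for s in (RET, LOH)]
    back = []
    for i in range(1, n):
        dist = positions[i] - positions[i - 1]
        new_score, ptr = [], []
        for dst in (RET, LOH):
            cands = [score[src] + log_trans(src, dst, dist)
                     for src in (RET, LOH)]
            best = RET if cands[RET] >= cands[LOH] else LOH
            ptr.append(best)
            new_score.append(cands[best] + log_emit(dst, i))
        back.append(ptr)
        score = new_score

    # Trace back from the best final state.
    state = RET if score[RET] >= score[LOH] else LOH
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch a long run of homozygous calls is labeled LOH because each homozygous SNP is slightly more probable under LOH than under retention, and the evidence accumulates across neighboring markers. A single heterozygous call inside such a run is treated as strong evidence for retention, limited only by the assumed error rate.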