Wu Ling-Yun, Zhou Xiaobo, Li Fuhai, Yang Xiaorong, Chang Chung-Che, Wong Stephen T C
Center for Biotechnology and Informatics, Department of Radiology, The Methodist Hospital Research Institute, Weill Medical College, Cornell University, Houston, TX 77030, USA.
Bioinformatics. 2009 Jan 1;25(1):61-7. doi: 10.1093/bioinformatics/btn561. Epub 2008 Oct 29.
Loss of heterozygosity (LOH) is one of the most important mechanisms in tumor evolution. LOH can be detected from the genotypes of tumor samples, with or without paired normal samples. In the paired case, LOH detection at informative single nucleotide polymorphisms (SNPs) is straightforward in the absence of genotyping errors. However, genotyping errors are unavoidable, and about 70% of SNPs are non-informative, so their LOH status can only be inferred from neighboring informative SNPs.
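As an illustration of the paired-sample case only, the error-free call at a single SNP can be sketched as below. This is not the CRP algorithm of the article. The function name `call_snp`, the genotype labels `AA`/`AB`/`BB`, and the choice to treat no-calls and conflicting calls as non-informative are assumptions made for this sketch.

```python
# Minimal sketch of naive paired-sample LOH calling at one SNP.
# Assumed (hypothetical) genotype encoding: "AA", "AB", "BB"; anything else is a no-call.
HOMOZYGOUS = {"AA", "BB"}
HETEROZYGOUS = "AB"


def call_snp(normal: str, tumor: str) -> str:
    """Classify one SNP from its paired normal and tumor genotype calls.

    A SNP is informative only when the normal sample is heterozygous.
    This sketch assumes no genotyping errors, which real data violate.
    """
    if normal != HETEROZYGOUS:
        # Normal is homozygous or a no-call: no heterozygosity to lose.
        return "non-informative"
    if tumor in HOMOZYGOUS:
        # Heterozygous in normal, homozygous in tumor.
        return "LOH"
    if tumor == HETEROZYGOUS:
        # Heterozygosity is retained in the tumor.
        return "retention"
    # Tumor no-call: treated as non-informative in this sketch.
    return "non-informative"


def call_sample(normal_calls, tumor_calls):
    """Apply call_snp position by position along paired genotype lists."""
    return [call_snp(n, t) for n, t in zip(normal_calls, tumor_calls)]
```

A single miscalled genotype flips the result of this per-SNP rule, and non-informative positions receive no LOH status at all, which is why inference from neighboring informative SNPs is needed.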
This article presents a novel LOH inference and segmentation algorithm based on the conditional random pattern (CRP) model. The new model explicitly considers the distance between neighboring SNPs, as well as the genotyping error rate and the heterozygous rate. The method is tested on simulated and real data from Affymetrix Human Mapping 500K SNP arrays. The experimental results show that the CRP method outperforms conventional methods based on the hidden Markov model (HMM).
Software is available upon request.