Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Am J Med Genet A. 2013 Sep;161A(9):2134-47. doi: 10.1002/ajmg.a.36038. Epub 2013 Jul 29.
This report describes an algorithm developed to predict the pathogenicity of copy number variants (CNVs) in large sample cohorts. CNVs (genomic deletions and duplications) are found both in healthy individuals and in individuals with genetic diagnoses. Distinguishing these two classes of CNVs can be challenging and usually requires extensive manual curation. We have developed PECONPI, an algorithm that assesses the pathogenicity of CNVs based on gene content and CNV frequency. This software was applied to a large cohort of patients with genetically heterogeneous non-syndromic hearing loss to score and rank each CNV by its relative pathogenicity. Of 636 individuals tested, we identified the likely underlying etiology of the hearing loss in 14 (2%): 1 with a homozygous deletion, 7 with a deletion of a known hearing loss gene and a point mutation on the trans allele, and 6 with a deletion larger than 1 Mb. We also identified two probands with smaller deletions encompassing genes that may be functionally related to their hearing loss. The ability of PECONPI to determine the pathogenicity of CNVs was then tested on a second genetically heterogeneous cohort, with congenital heart defects (CHDs), where it identified a likely etiology in 6 of 355 individuals (2%). We believe this tool can help researchers with large, genetically heterogeneous cohorts identify known pathogenic causes and novel disease genes.
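The abstract states only that CNVs are scored on gene content and CNV frequency and then ranked; it does not give the scoring formula. As a rough illustration of that idea, and not PECONPI's actual method, the following Python sketch ranks CNVs with a made-up score. The `CNV` record, the `toy_score` weighting (rarity multiplied by a gene count in which known disease genes count extra), the placeholder gene names, and the example frequencies are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; the field names are illustrative only."""
    sample: str
    genes: Tuple[str, ...]   # genes overlapped by the CNV
    frequency: float         # fraction of a reference cohort carrying an overlapping CNV


def toy_score(cnv: CNV, disease_genes: Set[str]) -> float:
    """Made-up score: rarer CNVs and CNVs hitting known disease genes rank higher.

    This is NOT the published PECONPI formula, which the abstract does not give.
    """
    rarity = 1.0 - cnv.frequency
    known_hits = sum(1 for gene in cnv.genes if gene in disease_genes)
    # Every gene counts once; a known disease gene counts two extra times (assumed weight).
    gene_weight = len(cnv.genes) + 2 * known_hits
    return rarity * gene_weight


def rank_cnvs(cnvs: Iterable[CNV], disease_genes: Set[str]) -> List[CNV]:
    """Return the CNVs sorted from highest to lowest toy score."""
    return sorted(cnvs, key=lambda c: toy_score(c, disease_genes), reverse=True)


if __name__ == "__main__":
    # Placeholder data, not taken from the study.
    known = {"GENE_A"}
    cohort = [
        CNV("S2", ("GENE_B", "GENE_C"), 0.30),  # common CNV, no known disease gene
        CNV("S3", (), 0.0),                     # rare but gene-free CNV
        CNV("S1", ("GENE_A",), 0.001),          # rare CNV over a known disease gene
    ]
    for cnv in rank_cnvs(cohort, known):
        print(cnv.sample, round(toy_score(cnv, known), 3))
```

In a sketch like this, a rare CNV over a known disease gene rises to the top of the list while common or gene-free CNVs sink, which is the kind of prioritization that would reduce manual curation in a large cohort.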