Division of Human Genetics, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA.
Hum Mol Genet. 2011 Mar 1;20(5):880-93. doi: 10.1093/hmg/ddq527. Epub 2010 Dec 8.
Rare copy number variations (CNVs) are a recognized cause of common human disease. For a single patient with a small CNV, the genetic element(s) whose copy number loss or gain underlies a specific phenotype might be predicted reasonably rapidly. Identifying the biological processes that are commonly disrupted across a large cohort of patients who carry larger CNVs, however, requires a more objective approach that exploits genomic resources. In this study, we first identified 98 large, rare CNVs in patients exhibiting multiple congenital anomalies. All patients presented with global developmental delay (DD), whereas secondary symptoms such as cardiac defects, craniofacial features and seizures were variably present. By applying a robust statistical procedure that matches patients' clinical phenotypes to laboratory mouse gene knockouts, we were able to strongly implicate anomalies in brain morphology and, separately, in long-term potentiation as manifestations of these DD patients' disorders. These and other significantly enriched model phenotypes provide insights into the pathoetiology of human DD and of the behavioral and anatomical secondary symptoms that are specific to DD patients. These enrichments set apart 103 genes, from among the thousands overlapped by these CNVs, as strong candidates whose copy number change causally underlies approximately 46% of the cohort's DD syndromes and between 59 and 80% of the cohort's secondary symptoms. We also identified significantly enriched model phenotypes among genes overlapped by CNVs in both DD and learning disability cohorts, indicating a congruent etiology. These results demonstrate the high predictive potential of model organism phenotypes when implicating candidate genes for rare genomic disorders.
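The abstract does not state which statistical test was used to detect enriched mouse knockout phenotypes. As a rough illustration only, the sketch below assumes a one-sided hypergeometric over-representation test, a common stand-in for this kind of analysis. The function name, the gene identifiers and the phenotype annotation are all hypothetical toy values, not data from the study.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_sampled, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background set
    n_annotated:  background genes whose mouse knockout shows the phenotype
    n_sampled:    background genes overlapped by the patients' CNVs
    n_overlap:    CNV-overlapped genes that also carry the annotation
    """
    total = comb(n_background, n_sampled)
    upper = min(n_annotated, n_sampled)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_sampled - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data: 20 background genes, 4 of which are annotated
# with an illustrative knockout phenotype.
background = {f"G{i}" for i in range(1, 21)}
brain_morphology_genes = {"G1", "G2", "G3", "G4"}
cnv_genes = {"G1", "G2", "G3", "G10", "G11"}

overlap = cnv_genes & brain_morphology_genes
p = enrichment_p_value(
    len(background), len(brain_morphology_genes), len(cnv_genes), len(overlap)
)
print(f"{len(overlap)} annotated genes among {len(cnv_genes)} CNV genes, p = {p:.4f}")
```

A real analysis would also need to correct for testing many phenotype terms and for biases such as gene length, since longer genes are more likely to be overlapped by a CNV; neither is handled in this sketch.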