Forrest Iain S, Huang Kuan-Lin, Eggington Julie M, Chung Wendy K, Jordan Daniel M, Do Ron
The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Medical Scientist Training Program, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Nat Genet. 2025 Jun 23. doi: 10.1038/s41588-025-02212-3.
Understanding the disease risk of genetic variants is fundamental to precision medicine. Estimates of penetrance (the probability of disease for individuals with a variant allele) rely on disease-specific cohorts, clinical testing and emerging electronic health record (EHR)-linked biobanks. These data sources, while valuable, each have limitations in quality, representativeness and analyzability. Here, we provide a historical account of the currently accepted pathogenicity classification system and data available in ClinVar, a public archive that aggregates variant interpretations but lacks detailed data for accurate penetrance assessment, highlighting its oversimplification of disease risk. We propose an integrative Bayesian framework that unifies pathogenicity and penetrance, leveraging both functional and real-world evidence to refine risk predictions. In addition, we advocate for enhancing ClinVar with the inclusion of high-priority phenotypes, age-stratified data and population-based cohorts linked to EHRs. We suggest developing a community repository of population-based penetrance estimates to support the clinical application of genetic data.
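The abstract does not specify the form of the proposed Bayesian framework. As a purely illustrative sketch, and not the authors' method, the following shows one simple way a prior belief about penetrance (standing in for functional evidence) could be updated with carrier counts from a population cohort (standing in for real-world evidence), using a conjugate Beta-Binomial model. The names `BetaPosterior`, `update_penetrance`, `prior_mean` and `prior_strength`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta distribution over penetrance (probability of disease in carriers)."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Mean of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)


def update_penetrance(prior_mean: float, prior_strength: float,
                      affected_carriers: int, total_carriers: int) -> BetaPosterior:
    """Conjugate Beta-Binomial update (illustrative assumption, not the paper's model).

    prior_mean      -- hypothetical prior penetrance, e.g. derived from functional evidence
    prior_strength  -- hypothetical pseudo-count weight given to that prior
    affected_carriers / total_carriers -- observed counts in a cohort
    """
    if not 0.0 < prior_mean < 1.0:
        raise ValueError("prior_mean must lie strictly between 0 and 1")
    if prior_strength <= 0:
        raise ValueError("prior_strength must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must be between 0 and total_carriers")

    # Express the prior as Beta(alpha0, beta0) pseudo-counts.
    alpha0 = prior_mean * prior_strength
    beta0 = (1.0 - prior_mean) * prior_strength

    # Add affected carriers to alpha and unaffected carriers to beta.
    unaffected = total_carriers - affected_carriers
    return BetaPosterior(alpha0 + affected_carriers, beta0 + unaffected)


if __name__ == "__main__":
    # Made-up example: a prior penetrance of 50% weighted as 10 pseudo-observations,
    # then 5 affected individuals among 90 carriers in a hypothetical biobank.
    posterior = update_penetrance(0.5, 10.0, 5, 90)
    print(f"posterior mean penetrance: {posterior.mean:.3f}")
```

In this toy setting the cohort counts pull the estimate well below the prior, which illustrates how population-based data can temper a risk estimate derived from other evidence. A fuller treatment would also need to handle ascertainment, age and phenotype definition, which the sketch ignores.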