Department of Mathematics, University of Exeter, North Park Road, Exeter, EX4 4QF, UK.
Living Systems Institute, Centre for Biomedical Modelling and Analysis, University of Exeter, Stocker Road, Exeter, EX4 4QD, UK.
Nat Commun. 2021 Nov 8;12(1):6441. doi: 10.1038/s41467-021-26501-7.
Clinical classification is essential for estimating disease prevalence but is difficult, often requiring complex investigations. The widespread availability of population level genetic data makes novel genetic stratification techniques a highly attractive alternative. We propose a generalizable mathematical framework for determining disease prevalence within a cohort using genetic risk scores. We compare and evaluate methods based on the means of genetic risk scores' distributions; the Earth Mover's Distance between distributions; a linear combination of kernel density estimates of distributions; and an Excess method. We demonstrate the performance of genetic stratification to produce robust prevalence estimates. Specifically, we show that robust estimates of prevalence are still possible even with rarer diseases, smaller cohort sizes and less discriminative genetic risk scores, highlighting the general utility of these approaches. Genetic stratification techniques offer exciting new research tools, enabling unbiased insights into disease prevalence and clinical characteristics unhampered by clinical classification criteria.
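As a minimal sketch of the simplest idea named above, the means-based estimate can be illustrated under one stated assumption: the cohort's genetic risk scores are a two-component mixture of a reference case distribution and a reference non-case distribution. The mixture mean is then p * mean_cases + (1 - p) * mean_controls, which can be solved for the prevalence p. The function name, variable names and synthetic numbers below are hypothetical and chosen for illustration only; they are not taken from the paper, and the paper's exact procedure may differ.

```python
import random
from statistics import fmean


def estimate_prevalence_by_means(mixture, reference_cases, reference_controls):
    """Estimate the proportion of cases in `mixture` from mean risk scores.

    Assumes the mixture mean is p * mean_cases + (1 - p) * mean_controls
    and solves that equation for p.
    """
    mean_cases = fmean(reference_cases)
    mean_controls = fmean(reference_controls)
    if mean_cases == mean_controls:
        raise ValueError("reference means are identical; the score carries no signal")
    p = (fmean(mixture) - mean_controls) / (mean_cases - mean_controls)
    # Sampling noise can push the raw estimate slightly outside [0, 1].
    return min(1.0, max(0.0, p))


# Synthetic illustration with made-up Gaussian scores (not data from the paper).
rng = random.Random(0)
cases = [rng.gauss(1.0, 1.0) for _ in range(5000)]
controls = [rng.gauss(0.0, 1.0) for _ in range(5000)]
true_prevalence = 0.2
mixture = [
    rng.gauss(1.0, 1.0) if rng.random() < true_prevalence else rng.gauss(0.0, 1.0)
    for _ in range(5000)
]
estimate = estimate_prevalence_by_means(mixture, cases, controls)
print(f"estimated prevalence: {estimate:.3f}")
```

The estimate depends only on three sample means, so it is cheap to compute, but its variance grows as the gap between the reference means shrinks. That is the situation the abstract describes as a less discriminative genetic risk score.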