Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; University Center for Primary Care and Public Health, University of Lausanne, Lausanne, Switzerland.
Am J Hum Genet. 2022 Nov 3;109(11):2009-2017. doi: 10.1016/j.ajhg.2022.09.011. Epub 2022 Oct 19.
Liability-scale model theory for the genetic basis of complex disease provides an important way to interpret, compare, and understand results generated from biological studies. In particular, through estimation of the liability-scale heritability (LSH), liability models make it possible to understand and compare the relative importance of the genetic and environmental risk factors that shape different clinically important disease outcomes. Large-scale biobank studies that link genetic information to electronic health records are increasingly becoming available; these records contain hundreds of disease diagnosis indicators, most of which occur infrequently within the sample. Here, we propose an extension of the existing liability-scale model theory suitable for estimating LSH in biobank studies of low-prevalence disease. In a simulation study, we find that our derived expression yields lower mean square error (MSE) and is less sensitive to prevalence misspecification than previous transformations for diseases with ≤2% population prevalence and LSH of ≤0.45, especially if the biobank sample prevalence is less than that of the wider population. Applying our expression to 13 diagnostic outcomes of ≤3% prevalence in the UK Biobank study revealed important differences in the LSH obtained from the different theoretical expressions, and these differences affect the conclusions drawn when comparing LSH across disease outcomes. This demonstrates that estimation and prediction for low-prevalence disease outcomes require careful consideration, and it facilitates improved inference of the underlying genetic basis of diseases with ≤2% population prevalence, especially where biobank sample ascertainment results in a healthier sample population.
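For context, the "previous transformations" that the abstract compares against can be illustrated with the widely used observed-scale to liability-scale conversion. The sketch below is not the expression derived in this paper, which the abstract does not state. It assumes a standard normal liability with a threshold set by the population prevalence, and the function and parameter names (`liability_scale_h2`, `h2_obs`, `pop_prev`, `sample_prev`) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def liability_scale_h2(h2_obs, pop_prev, sample_prev=None):
    """Convert an observed-scale (0/1) heritability to the liability scale.

    Illustrative sketch of the commonly used earlier transformation,
    NOT the new expression proposed in the paper.

    h2_obs      -- heritability estimated on the observed binary scale
    pop_prev    -- disease prevalence K in the wider population
    sample_prev -- disease prevalence P in the study sample
                   (assumed equal to pop_prev when omitted)
    """
    if not 0.0 < pop_prev < 1.0:
        raise ValueError("pop_prev must lie strictly between 0 and 1")
    if sample_prev is None:
        sample_prev = pop_prev
    if not 0.0 < sample_prev < 1.0:
        raise ValueError("sample_prev must lie strictly between 0 and 1")

    std_normal = NormalDist()  # standard normal liability distribution
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1.0 - pop_prev)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)

    # With sample_prev == pop_prev this reduces to h2_obs * K(1-K) / z^2.
    scale = (pop_prev * (1.0 - pop_prev)) ** 2 / (
        z ** 2 * sample_prev * (1.0 - sample_prev)
    )
    return h2_obs * scale


if __name__ == "__main__":
    # Hypothetical inputs: 2% population prevalence, 1% sample prevalence.
    print(liability_scale_h2(0.01, pop_prev=0.02, sample_prev=0.01))
    # Same observed-scale estimate, assuming the sample matches the population.
    print(liability_scale_h2(0.01, pop_prev=0.02))
```

Under these assumptions the scaling factor depends strongly on both prevalences when the disease is rare, which is one way to see why the choice of prevalence matters for low-prevalence outcomes in a sample that is healthier than the wider population.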