Souaiaia Tade, Wu Hei Man, Hoggart Clive, O'Reilly Paul F
Department of Cellular Biology, SUNY Downstate Health Sciences, Brooklyn, United States.
Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York, United States.
Elife. 2025 Jan 8;12:RP87522. doi: 10.7554/eLife.87522.
The use of siblings to infer the factors influencing complex traits has been a cornerstone of quantitative genetics. Here, we utilise siblings for a novel application: the inference of genetic architecture, specifically that relating to individuals with extreme trait values (e.g. in the top 1%). Inferring the genetic architecture most relevant to this group of individuals is important because they are at the greatest risk of disease and may be more likely to harbour rare variants of large effect due to natural selection. We develop a theoretical framework that derives expected distributions of sibling trait values based on an index sibling's trait value, estimated trait heritability, and null assumptions that include infinitesimal genetic effects and environmental factors that are either controlled for or have combined Gaussian effects. This framework is then used to develop statistical tests powered to distinguish trait tails characterised by common polygenic architecture from those that include substantial enrichments of de novo or rare variant (Mendelian) architecture. We apply our tests to UK Biobank data here, although we note that they can be used to infer genetic architecture in any cohort or health registry that includes siblings and their trait values, since these tests do not use genetic data. We describe how our approach has the potential to help disentangle the genetic and environmental causes of extreme trait values, and to improve the design and power of future sequencing studies to detect rare variants.
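The null expectation described in the abstract can be illustrated with a short sketch. It is not the authors' implementation. It assumes a standardised trait (mean 0, variance 1), purely additive infinitesimal genetic effects, and no shared sibling environment, so that the expected sibling correlation is h²/2 and a sibling's value, given an index sibling's value z, is Gaussian with mean z·h²/2 and variance 1 − (h²/2)². The function names `sibling_conditional` and `sibling_mean_z`, and the simple z-statistic itself, are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def sibling_conditional(z_index: float, h2: float) -> NormalDist:
    """Conditional distribution of a sibling's standardised trait value,
    given the index sibling's standardised value z_index.

    Assumption (not taken from the paper's code): additive infinitesimal
    model with no shared environment, so the sibling correlation is h2 / 2.
    """
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability h2 must lie in [0, 1]")
    r = h2 / 2.0                      # assumed sibling correlation
    mean = r * z_index                # regression toward the population mean
    sd = math.sqrt(1.0 - r * r)       # residual spread after conditioning
    return NormalDist(mean, sd)


def sibling_mean_z(z_index: float, sibling_values: list[float], h2: float) -> float:
    """Illustrative z-statistic: observed mean of sibling values versus the
    mean expected under the polygenic null, for index siblings at z_index."""
    if not sibling_values:
        raise ValueError("sibling_values must not be empty")
    null = sibling_conditional(z_index, h2)
    n = len(sibling_values)
    observed_mean = sum(sibling_values) / n
    standard_error = null.stdev / math.sqrt(n)
    return (observed_mean - null.mean) / standard_error


if __name__ == "__main__":
    null = sibling_conditional(z_index=2.0, h2=0.5)
    print(f"expected sibling mean: {null.mean:.3f}, sd: {null.stdev:.3f}")
```

Under these assumptions, a strongly negative statistic for index siblings in the upper tail would mean their siblings sit closer to the population mean than the polygenic null predicts, which is the kind of departure one might expect if de novo variants were enriched in that tail. A real analysis would need to account for the spread of index values within the tail rather than a single z.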