Li Taibo, Ferraro Nicole, Strober Benjamin J, Aguet Francois, Kasela Silva, Arvanitis Marios, Ni Bohan, Wiel Laurens, Hershberg Elliot, Ardlie Kristin, Arking Dan E, Beer Rebecca L, Brody Jennifer, Blackwell Thomas W, Clish Clary, Gabriel Stacey, Gerszten Robert, Guo Xiuqing, Gupta Namrata, Johnson W Craig, Lappalainen Tuuli, Lin Henry J, Liu Yongmei, Nickerson Deborah A, Papanicolaou George, Pritchard Jonathan K, Qasba Pankaj, Shojaie Ali, Smith Josh, Sotoodehnia Nona, Taylor Kent D, Tracy Russell P, Van Den Berg David, Wheeler Matthew T, Rich Stephen S, Rotter Jerome I, Battle Alexis, Montgomery Stephen B
Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA.
Cell Genom. 2023 Sep 6;3(10):100401. doi: 10.1016/j.xgen.2023.100401. eCollection 2023 Oct 11.
Each human genome has tens of thousands of rare genetic variants; however, identifying impactful rare variants remains a major challenge. We demonstrate how personal multi-omics can enable identification of impactful rare variants using the Multi-Ethnic Study of Atherosclerosis, in which several hundred individuals had whole-genome sequencing, transcriptomes, methylomes, and proteomes collected at two time points, 10 years apart. We evaluated each multi-omics phenotype's ability to separately and jointly inform functional rare variation. By combining expression and protein data, we observed rare stop variants 62 times and rare frameshift variants 216 times as frequently as controls, compared to 13-27 times as frequently for expression or protein effects alone. We extended a Bayesian hierarchical model, "Watershed," to prioritize specific rare variants underlying multi-omics signals across the regulatory cascade. With this approach, we identified rare variants that exhibited large effect sizes on multiple complex traits including height, schizophrenia, and Alzheimer's disease.