Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Methods, Collaboration and Outreach Group, Genentech/Roche, South San Francisco, CA 94080, USA.
Am J Hum Genet. 2022 Mar 3;109(3):446-456. doi: 10.1016/j.ajhg.2022.01.017. Epub 2022 Feb 24.
Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants. Unlike existing one-dimensional scoring methods, MACIE views variant functionality as a composite attribute encompassing multiple characteristics and estimates the joint posterior functional probabilities of each genomic position. This estimate offers more comprehensive and interpretable information in the presence of multiple aspects of functionality. Applied to a variety of independent coding and non-coding datasets, MACIE demonstrates powerful and robust performance in discriminating between functional and non-functional variants. We also show an application of MACIE to fine-mapping and heritability enrichment analysis by using the lipids GWAS summary statistics data from the European Network for Genetic and Genomic Epidemiology Consortium.
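To make the contrast between a one-dimensional rating and a joint posterior over several functional classes concrete, here is a minimal sketch. It is not the MACIE model: it assumes two hypothetical binary latent classes (labelled "conserved" and "regulatory" here), one Gaussian annotation score per class with unit variance, conditional independence of the two scores given the latent state, and fixed, made-up parameter values rather than values estimated from data. All names and numbers are illustrative assumptions.

```python
import math
from itertools import product


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical prior over the joint latent state (conserved, regulatory).
# These values are invented for illustration, not taken from the paper.
PRIOR = {(0, 0): 0.90, (1, 0): 0.04, (0, 1): 0.05, (1, 1): 0.01}

# Hypothetical annotation means when a class is inactive (index 0)
# or active (index 1); the standard deviation is fixed at 1.
MEANS = {"conservation": (0.0, 2.0), "regulatory": (0.0, 2.0)}


def joint_posterior(conservation_score, regulatory_score):
    """Posterior over the four joint states, computed with Bayes' rule."""
    unnormalized = {}
    for state in product((0, 1), repeat=2):
        c, r = state
        likelihood = (
            normal_pdf(conservation_score, MEANS["conservation"][c], 1.0)
            * normal_pdf(regulatory_score, MEANS["regulatory"][r], 1.0)
        )
        unnormalized[state] = PRIOR[state] * likelihood
    total = sum(unnormalized.values())
    return {state: value / total for state, value in unnormalized.items()}


def marginal_conserved(posterior):
    """Marginal probability that the 'conserved' class is active."""
    return posterior[(1, 0)] + posterior[(1, 1)]


if __name__ == "__main__":
    # A variant with a high conservation score and a neutral regulatory score.
    post = joint_posterior(3.0, 0.0)
    for state, prob in sorted(post.items()):
        print(state, round(prob, 3))
    print("P(conserved) =", round(marginal_conserved(post), 3))
```

The four-entry posterior keeps "conserved only", "regulatory only", "both", and "neither" apart, a distinction that a single scalar score would collapse. Marginals for each class can still be recovered by summing over the other class, as `marginal_conserved` does.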