Arbesfeld Jeremy A, Da Estelle Y, Stevenson James S, Kuzma Kori, Paul Anika, Farris Tierra, Capodanno Benjamin J, Grindstaff Sally B, Riehle Kevin, Saraiva-Agostinho Nuno, Safer Jordan F, Casper Jonathan, Haeussler Maximilian, Milosavljevic Aleksandar, Foreman Julia, Firth Helen V, Hunt Sarah E, Iqbal Sumaiya, Cline Melissa S, Rubin Alan F, Wagner Alex H
The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA.
Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia.
Genome Biol. 2025 Jun 25;26(1):179. doi: 10.1186/s13059-025-03647-x.
BACKGROUND: Experimental data from functional assays have a critical role in interpreting the impact of genetic variants. Assay data must be unambiguously mapped to a reference genome to make it accessible, but it is often reported relative to assay-specific sequences, complicating downstream use and integration of variant data across resources. To make multiplexed assays of variant effect (MAVE) data more broadly available to the research and clinical communities, the Atlas of Variant Effects Alliance mapped MAVE data from the MaveDB community database to human reference sequences, creating an extensive set of machine-readable homology mappings that are incorporated into widely used human genomics applications.
RESULTS: Here, we map approximately 9.0 million individual protein and nucleotide variants in MaveDB to the human genome, describing the examined variants with respect to human reference sequences while preserving the data provenance of the original MAVE sequences. We then disseminate the results to major genomic resources including the Genomics 2 Proteins Portal, UCSC Genome Browser, Ensembl Variant Effect Predictor, and DECIPHER platform. Within these applications, MAVE variants can now be visualized and integrated with other relevant clinical and biological data, making additional knowledge available when performing variant interpretation and conducting other research activities.
CONCLUSIONS: Mapping MAVE variants to human reference sequences and sharing the mapped dataset with several key human genomics applications enables a new and diverse set of applications for MAVE data. This study provides increased access to functional data that can assist in clinical variant interpretation pipelines and enable biomedical research and discovery.