Ungar Rachel A, Goddard Pagé C, Jensen Tanner D, Degalez Fabien, Smith Kevin S, Jin Christopher A, Bonner Devon E, Bernstein Jonathan A, Wheeler Matthew T, Montgomery Stephen B
Department of Genetics, School of Medicine, Stanford University.
Department of Pathology, School of Medicine, Stanford University.
medRxiv. 2024 Jan 12:2024.01.11.24301165. doi: 10.1101/2024.01.11.24301165.
Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and for diagnosing disease. Prior studies have demonstrated that the choice of genome build impacts variant interpretation and diagnostic yield in genomic analyses. To determine the extent to which genome build also impacts transcriptomic analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from the Undiagnosed Diseases Network (UDN) and the Genomics Research to Elucidate the Genetics of Rare Disease (GREGoR) Consortium. We identified 2,800 genes with build-dependent quantification across six routinely collected biospecimens, including 1,391 protein-coding genes and 341 known rare disease genes. We further observed multiple genes with detectable expression in only a subset of genome builds. Finally, we characterized how genome build impacts the detection of outlier transcriptomic events. Together, we provide a database of genes impacted by build choice and recommend that transcriptomics-guided analyses and diagnoses be cross-referenced with these data for robustness.
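As an illustration of what cross-referencing outlier calls across builds could look like, the following is a minimal sketch and not the method used in the study. It assumes a simple per-gene z-score rule for expression outliers, a toy threshold of 1.5, and hypothetical gene and sample names (`GENE_A`, `GENE_B`, `S1` to `S5`). The abstract does not specify the outlier-detection method or any thresholds.

```python
from statistics import mean, stdev

def zscore_outliers(expr, threshold):
    """Return (sample, gene) pairs whose per-gene z-score exceeds the threshold.

    expr maps sample -> {gene: expression value}. The z-score rule is an
    assumption made for illustration only.
    """
    outliers = set()
    genes = {g for per_sample in expr.values() for g in per_sample}
    for gene in genes:
        values = {s: expr[s][gene] for s in expr if gene in expr[s]}
        if len(values) < 3:
            continue  # too few samples to estimate spread
        mu, sd = mean(values.values()), stdev(values.values())
        if sd == 0:
            continue  # constant expression, no outliers possible
        for sample, value in values.items():
            if abs((value - mu) / sd) > threshold:
                outliers.add((sample, gene))
    return outliers

def build_dependent_outliers(expr_by_build, threshold=1.5):
    """Map each outlier call not shared by all builds to the builds that made it."""
    per_build = {b: zscore_outliers(e, threshold) for b, e in expr_by_build.items()}
    all_calls = set().union(*per_build.values())
    shared = set.intersection(*per_build.values())
    return {
        call: sorted(b for b, calls in per_build.items() if call in calls)
        for call in all_calls - shared
    }

def toy_matrix(gene_values):
    """Build sample -> {gene: value} from gene -> list of values (hypothetical data)."""
    n = len(next(iter(gene_values.values())))
    return {f"S{i + 1}": {g: v[i] for g, v in gene_values.items()} for i in range(n)}

# Hypothetical quantifications of the same five samples under two builds.
expr_by_build = {
    "hg19": toy_matrix({"GENE_A": [10, 10, 10, 10, 50], "GENE_B": [5, 5, 5, 5, 25]}),
    "hg38": toy_matrix({"GENE_A": [10, 11, 9, 10, 12], "GENE_B": [5, 5, 5, 5, 25]}),
}
calls = build_dependent_outliers(expr_by_build)
print(calls)
```

In this toy input, the `GENE_B` outlier in sample `S5` is called under both builds and is therefore excluded, while the `GENE_A` outlier appears only under hg19 and is reported as build-dependent. A call of that kind is the sort of result that would merit checking against the database of build-impacted genes before interpretation.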