

Allele Specific Expression Quality Control Fills Critical Gap in Transcriptome Assisted Rare Variant Interpretation.

Authors

Ganapathy Kaushik Ram, Song Eric, Munro Daniel, Torkamani Ali, Mohammadi Pejman

Affiliations

Dept. of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA.

Scripps Research Translational Institute, La Jolla, CA, USA.

Publication information

bioRxiv. 2025 Jun 8:2025.05.30.657086. doi: 10.1101/2025.05.30.657086.

Abstract

Allele-specific expression (ASE) captures the functional impact of genetic variation on transcription, offering a high-resolution view of cis-regulatory effects, but its quality can be diminished by technical, biological, and analysis artifacts. We introduce aseQC, a statistical framework that quantifies sample-level ASE quality in terms of the overall expected extra-binomial variation, so that uncharacteristically noisy samples in a cohort can be excluded to improve the robustness of downstream analyses. Applied to a dataset of rare Mendelian muscular disorders, aseQC successfully identified previously annotated low-quality cases, demonstrating its clinical genomic utility. Applied to 15,253 samples from the extensively quality-controlled GTEx project, aseQC uncovered 563 low-quality samples that exhibit excessive allelic imbalance. We find that these samples are associated with specific processing dates but are not otherwise adequately described by any other quality control measure or metadata available in GTEx. We show that these low-quality samples lead to 23.6-fold and 31.6-fold increases in ASE and splicing outliers, respectively, degrading the performance of transcriptome analysis for rare variant interpretation. In contrast, we did not observe any adverse effect of including these samples in common-variant analysis using quantitative trait locus mapping. By enabling quick and reliable assessment of sample quality, aseQC provides a critical step for identifying subtle quality issues that still matter for successful analysis of rare variant effects using transcriptome data.
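To make "extra-binomial variation" concrete, the following is a minimal sketch and not the aseQC implementation, which the abstract does not describe in detail. It assumes each sample is a list of (reference count, total count) pairs at heterozygous sites, that the expected reference ratio is 0.5, and that a beta-binomial variance n/4 * (1 + (n - 1) * rho) is a reasonable noise model. The function names, the moment estimator, and the median-plus-MAD cutoff are all illustrative assumptions.

```python
from statistics import median


def overdispersion(sites):
    """Moment estimate of the overdispersion rho for one sample.

    sites: iterable of (ref_count, total_count) at heterozygous sites.
    Assumes an expected reference ratio of 0.5 (illustrative assumption).
    """
    excess = 0.0  # observed squared deviation minus the binomial variance
    scale = 0.0   # variance added per unit of rho under a beta-binomial
    for ref, total in sites:
        if total < 2:
            continue  # a single read carries no information about rho
        excess += (ref - total / 2) ** 2 - total / 4
        scale += total * (total - 1) / 4
    if scale == 0:
        raise ValueError("need at least one site with coverage >= 2")
    # Clamp to the valid range: 0 is purely binomial, 1 is fully monoallelic.
    return min(1.0, max(0.0, excess / scale))


def flag_noisy(rho_by_sample, k=5.0):
    """Return samples whose rho exceeds median + k * MAD across the cohort.

    The cutoff rule and k are hypothetical choices, not taken from the paper.
    """
    values = list(rho_by_sample.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    cutoff = center + k * mad
    return {name for name, v in rho_by_sample.items() if v > cutoff}
```

A sample whose reads split evenly at every site gets rho = 0, while one that is monoallelic at every site gets rho = 1; a cohort-level rule such as `flag_noisy` then singles out samples whose rho is far above the rest.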

