Tanudisastro Hope A, Cuomo Anna S E, Weisburd Ben, Welland Matthew, Spenceley Eleanor, Franklin Michael, Xue Angli, Bowen Blake, Wing Kristof, Tang Owen, Gray Michael, Reis Andre L M, Margoliash Jonathan, Kurtas Nehir E, Pullin Jeffrey M, Lee Arthur S, Brand Harrison, Harper Michael, Bobowik Katalina, Silk Michael, Marshall John, Bakiris Vivian, Madala Bindu Swapna, Uren Caitlin, Bartie Caitlin, Senabouth Anne, Dashnow Harriet, Fearnley Liam, Martin-Trujillo Alejandro, Dolzhenko Egor, Qiao Zhen, Grieve Stuart M, Nguyen Tung, Ben-David Eyal, Chen Ling, Farh Kyle Kai-How, Talkowski Michael, Alexander Stephen I, Siggs Owen M, Gruenschloss Leonhard, Nicholas Hannah R, Piscionere Jennifer, Simons Cas, Wallace Chris, Gymrek Melissa, Deveson Ira W, Hewitt Alex W, Figtree Gemma A, de Lange Katrina M, Powell Joseph E, MacArthur Daniel G
bioRxiv. 2025 Apr 9:2024.11.02.621562. doi: 10.1101/2024.11.02.621562.
Tandem repeats (TRs) are highly polymorphic, repetitive sequences dispersed across the human genome. They are crucial regulators of gene expression and diverse biological processes, but have remained underexplored relative to other classes of genetic variation because of historical challenges in calling and analyzing them accurately. Here, we leverage whole-genome and single-cell RNA sequencing from over 5.4 million blood-derived cells from 1,925 individuals to explore the impact of variation in over 1.7 million polymorphic TR loci on blood cell type-specific gene expression. We identify over 62,000 single-cell expression quantitative trait TR loci (sc-eTRs), 16.6% of which are specific to one of 28 distinct immune cell types. Further fine-mapping uncovers 4,283 sc-eTRs as candidate causal drivers of gene expression in 13.6% of genes tested genome-wide. We show through colocalization that TRs are likely mediators of genetic associations with immune-mediated and hematological traits in over 700 genes, and further identify novel TRs warranting investigation in rare disease cohorts. TRs are critical, yet long-overlooked, contributors to cell type-specific gene expression, with implications for understanding rare disease pathogenesis and the genetic architecture of complex traits.