

Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps.

Affiliations

Department of Molecular Medicine and Surgery, Karolinska Institute, 171 76 Stockholm, Sweden.

Science for Life Laboratory, Karolinska Institutet, 171 65 Solna, Sweden.

Publication information

Genome Res. 2024 Nov 20;34(11):1785-1797. doi: 10.1101/gr.279346.124.

Abstract

Chromosomal inversions (INVs) are particularly challenging to detect because they are copy-number neutral and often associated with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by disrupting genes or altering regulatory regions of dosage-sensitive genes in cis. Short-read genome sequencing (srGS) can resolve only ∼70% of cytogenetically visible inversions referred to clinical diagnostic laboratories, likely because breakpoints fall in repetitive regions. Here, we study 12 inversions by long-read genome sequencing (lrGS) (n = 9) or srGS (n = 3) and resolve nine of them. In four cases, the inversion breakpoint region was missing from at least one of the human reference genomes (GRCh37, GRCh38, T2T-CHM13), and a reference-agnostic analysis was needed. One of these cases, an INV9 mappable only in de novo assembled lrGS data using T2T-CHM13, disrupts EHMT1, consistent with a Mendelian diagnosis (Kleefstra syndrome 1; MIM#610253). Next, by pairwise comparison between T2T-CHM13, GRCh37, and GRCh38, as well as the chimpanzee and bonobo, we show that hundreds of megabases of sequence are missing from at least one human reference, highlighting that primate genomes contribute to genomic diversity. Aligning population genomic data to these regions indicated that they vary between individuals. Our analysis emphasizes that T2T-CHM13 is necessary to maximize the value of lrGS for optimal inversion detection in clinical diagnostics. These results highlight the importance of leveraging diverse and comprehensive reference genomes to resolve unsolved molecular cases in rare diseases.
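The pairwise reference comparison described in the abstract amounts to asking which stretches of one assembly have no aligned counterpart in another. The sketch below illustrates that idea only; it is not the authors' pipeline. It assumes the alignments have already been reduced to half-open coordinate intervals on a single contig, and the function names (`merge_intervals`, `uncovered_regions`) and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in contig coordinates


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent half-open intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def uncovered_regions(contig_length: int, aligned: List[Interval]) -> List[Interval]:
    """Return the parts of [0, contig_length) not covered by any aligned block."""
    gaps: List[Interval] = []
    cursor = 0
    for start, end in merge_intervals(aligned):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < contig_length:
        gaps.append((cursor, contig_length))
    return gaps


# Toy example (made-up coordinates): a 1,000 bp contig from one assembly,
# with three blocks that align to the other assembly.
toy_blocks = [(0, 200), (150, 400), (700, 900)]
toy_gaps = uncovered_regions(1000, toy_blocks)
missing_bp = sum(end - start for start, end in toy_gaps)
print(toy_gaps, missing_bp)
```

In a real analysis the intervals would come from a whole-genome aligner's output, and summing the uncovered lengths per contig would give the amount of sequence present in one reference but absent from the other.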


