Genotyping sequence-resolved copy number variation using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes.

Author information

Ma Walfred, Chaisson Mark

Publication information

bioRxiv. 2025 Apr 13:2024.08.11.607269. doi: 10.1101/2024.08.11.607269.

Abstract

Copy number variant (CNV) genes are important in evolution and disease, yet sequence variation in CNV genes remains a blind spot in large-scale studies. We present ctyper, a method that leverages pangenomes to produce allele-specific copy numbers with locally phased variants from next-generation sequencing (NGS) reads. Benchmarking on 3,351 CNV genes, including HLA, SMN, and CYP2D6, and 212 challenging medically relevant (CMR) genes that are poorly mapped by NGS, ctyper captures 96.5% of phased variants with ≥99.1% correctness of copy number on CNV genes and 94.8% of phased variants on CMR genes. Applying alignment-free algorithms, ctyper requires 1.5 hours per genome on a single CPU. The results improve prediction of gene expression compared to known expression quantitative trait loci (eQTL) variants. Allele-specific expression quantified divergent expression on 7.94% of paralogs and tissue-specific biases on 4.68% of paralogs. We found reduced expression of SMN-2 due to SMN1 conversion, potentially affecting spinal muscular atrophy, and increased expression of translocated duplications of AMY2B. Overall, ctyper enables biobank-scale genotyping of CNV and CMR genes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4642/11995949/2beeb5121835/nihpp-2024.08.11.607269v6-f0001.jpg
