Genome-wide profiling of highly similar paralogous genes using HiFi sequencing.

Authors

Chen Xiao, Baker Daniel, Dolzhenko Egor, Devaney Joseph M, Noya Jessica, Berlyoung April S, Brandon Rhonda, Hruska Kathleen S, Lochovsky Lucas, Kruszka Paul, Newman Scott, Farrow Emily, Thiffault Isabelle, Pastinen Tomi, Kasperaviciute Dalia, Gilissen Christian, Vissers Lisenka, Hoischen Alexander, Berger Seth, Vilain Eric, Délot Emmanuèle, Eberle Michael A

Affiliations

PacBio, Menlo Park, CA, USA.

GeneDx, Gaithersburg, MD, USA.

Publication information

Nat Commun. 2025 Mar 8;16(1):2340. doi: 10.1038/s41467-025-57505-2.

DOI: 10.1038/s41467-025-57505-2
PMID: 40057485
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11890787/
Abstract

Variant calling is hindered in segmental duplications by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of paralogous genes together. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations revealed highly variable copy numbers of these regions. We identified 23 paralog groups with exceptionally low within-group diversity, where extensive gene conversion and unequal crossing over contribute to highly similar gene copies. Furthermore, our analysis of 36 trios identified 7 de novo SNVs and 4 de novo gene conversion events, 2 of which are non-allelic. Finally, we summarized extensive genetic diversity in 9 medically relevant genes previously considered challenging to genotype. Paraphase provides a framework for resolving gene paralogs, enabling accurate testing in medically relevant genes and population-wide studies of previously inaccessible genes.
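
The idea of phasing all copies of a paralog group together can be illustrated with a toy sketch. This is not Paraphase's implementation; it assumes, hypothetically, that every read has already been aligned to a single representative copy and reduced to its alleles at a few variable sites, and that every read covers all of those sites. The function name `phase_haplotypes` and the `min_support` threshold are illustrative assumptions.

```python
from collections import Counter


def phase_haplotypes(reads, min_support=2):
    """Group reads by their alleles at the variable sites.

    Each allele string seen in at least `min_support` reads is treated as one
    haplotype, i.e. one gene copy in the paralog group. Strings with less
    support are discarded as likely sequencing errors.
    """
    counts = Counter(reads)
    return {hap: n for hap, n in counts.items() if n >= min_support}


# Toy input: each string holds one read's alleles at three variable sites.
reads = [
    "ACG", "ACG", "ACG",
    "ATG", "ATG",
    "GTG", "GTG", "GTG",
    "GTA",  # seen once only, so it is dropped
]

haplotypes = phase_haplotypes(reads)
copy_number = len(haplotypes)  # number of distinct haplotypes across all copies
print(haplotypes, copy_number)
```

In this toy setting the total copy number of the group is simply the number of distinct, well-supported haplotypes. A practical method would also have to handle reads that cover only part of the region and copies that are identical in sequence, which this sketch ignores.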

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fa6/11890787/853daf89725c/41467_2025_57505_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fa6/11890787/b5c0fd8f19d8/41467_2025_57505_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fa6/11890787/5b8cb870cc41/41467_2025_57505_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fa6/11890787/30b18419aa70/41467_2025_57505_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fa6/11890787/9e089b743413/41467_2025_57505_Fig5_HTML.jpg

Similar articles

1
Genome-wide profiling of highly similar paralogous genes using HiFi sequencing.
Nat Commun. 2025 Mar 8;16(1):2340. doi: 10.1038/s41467-025-57505-2.
2
Increased mutation and gene conversion within human segmental duplications.
Nature. 2023 May;617(7960):325-334. doi: 10.1038/s41586-023-05895-y. Epub 2023 May 10.
3
Gaps and complex structurally variant loci in phased genome assemblies.
Genome Res. 2023 Apr;33(4):496-510. doi: 10.1101/gr.277334.122. Epub 2023 May 10.
4
Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies.
Am J Hum Genet. 2021 May 6;108(5):919-928. doi: 10.1016/j.ajhg.2021.03.014. Epub 2021 Mar 30.
5
Analysis of targeted and whole genome sequencing of PacBio HiFi reads for a comprehensive genotyping of gene-proximal and phenotype-associated Variable Number Tandem Repeats.
PLoS Comput Biol. 2025 Apr 7;21(4):e1012885. doi: 10.1371/journal.pcbi.1012885. eCollection 2025 Apr.
6
Long-read sequencing resolves the clinically relevant locus, supporting a new clinical test for Congenital Adrenal Hyperplasia.
medRxiv. 2025 Feb 10:2025.02.07.25321404. doi: 10.1101/2025.02.07.25321404.
7
A systematic strategy for identifying causal single nucleotide polymorphisms and their target genes on Juvenile arthritis risk haplotypes.
BMC Med Genomics. 2024 Jul 12;17(1):185. doi: 10.1186/s12920-024-01954-z.
8
Exploring the complexity of systemic sclerosis etiology by trio whole genome sequencing.
Hum Mol Genet. 2024 Sep 19;33(19):1643-1647. doi: 10.1093/hmg/ddae105.
9
Specific correlation between the major chromosome 10q26 haplotype conferring risk for age-related macular degeneration and the expression of .
Mol Vis. 2017 Jun 14;23:318-333. eCollection 2017.
10
Direct long-read visualization reveals hidden variation in GCH1 gene copy number and precise expansion steps.
BMC Genomics. 2025 Jul 17;26(1):671. doi: 10.1186/s12864-025-11859-5.

Cited by

1
The Platinum Pedigree: a long-read benchmark for genetic variants.
Nat Methods. 2025 Aug;22(8):1669-1676. doi: 10.1038/s41592-025-02750-y. Epub 2025 Aug 4.
2
Long and Accurate: How HiFi Sequencing is Transforming Genomics.
Genomics Proteomics Bioinformatics. 2025 May 10;23(1). doi: 10.1093/gpbjnl/qzaf003.
3
The additional diagnostic yield of long-read sequencing in undiagnosed rare diseases.

References

1
HiFi long-read genomes for difficult-to-detect, clinically relevant variants.
Am J Hum Genet. 2025 Feb 6;112(2):450-456. doi: 10.1016/j.ajhg.2024.12.013. Epub 2025 Jan 13.
2
Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation.
Nat Commun. 2023 Oct 27;14(1):6845. doi: 10.1038/s41467-023-42531-9.
3
A pangenome reference of 36 Chinese populations.
Genome Res. 2025 Apr 14;35(4):559-571. doi: 10.1101/gr.279970.124.
4
GREGoR: Accelerating Genomics for Rare Diseases.
ArXiv. 2024 Dec 18:arXiv:2412.14338v1.
Nature. 2023 Jul;619(7968):112-121. doi: 10.1038/s41586-023-06173-7. Epub 2023 Jun 14.
4
A draft human pangenome reference.
Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. Epub 2023 May 10.
5
Increased mutation and gene conversion within human segmental duplications.
Nature. 2023 May;617(7960):325-334. doi: 10.1038/s41586-023-05895-y. Epub 2023 May 10.
6
Gaps and complex structurally variant loci in phased genome assemblies.
Genome Res. 2023 Apr;33(4):496-510. doi: 10.1101/gr.277334.122. Epub 2023 May 10.
7
Comprehensive de novo mutation discovery with HiFi long-read sequencing.
Genome Med. 2023 May 8;15(1):34. doi: 10.1186/s13073-023-01183-6.
8
Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing.
Am J Hum Genet. 2023 Feb 2;110(2):240-250. doi: 10.1016/j.ajhg.2023.01.001. Epub 2023 Jan 19.
9
PBSIM3: a simulator for all types of PacBio and ONT long reads.
NAR Genom Bioinform. 2022 Dec 1;4(4):lqac092. doi: 10.1093/nargab/lqac092. eCollection 2022 Dec.
10
Diagnostic analysis of the highly complex OPN1LW/OPN1MW gene cluster using long-read sequencing and MLPA.
NPJ Genom Med. 2022 Nov 9;7(1):65. doi: 10.1038/s41525-022-00334-9.