De Novo Assembly of Two Swedish Genomes Reveals Missing Segments from the Human GRCh38 Reference and Improves Variant Calling of Population-Scale Sequencing Data.

Author information

Ameur Adam, Che Huiwen, Martin Marcel, Bunikis Ignas, Dahlberg Johan, Höijer Ida, Häggqvist Susana, Vezzi Francesco, Nordlund Jessica, Olason Pall, Feuk Lars, Gyllensten Ulf

Affiliations

Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, 752 36 Uppsala, Sweden.

Science for Life Laboratory, Department of Biochemistry and Biophysics (DBB), Stockholm University, 114 19 Stockholm, Sweden.

Publication information

Genes (Basel). 2018 Oct 9;9(10):486. doi: 10.3390/genes9100486.

DOI:10.3390/genes9100486
PMID:30304863
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6210158/
Abstract

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba58/6210158/0cfcfae5b8f1/genes-09-00486-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba58/6210158/70898b8faf91/genes-09-00486-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba58/6210158/5a14738993fa/genes-09-00486-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba58/6210158/fe0e614aa441/genes-09-00486-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba58/6210158/366767583302/genes-09-00486-g005.jpg
