Assembly and annotation of an Ashkenazi human reference genome.

Affiliations

Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.

Publication information

Genome Biol. 2020 Jun 2;21(1):129. doi: 10.1186/s13059-020-02047-7.

DOI:10.1186/s13059-020-02047-7
PMID:32487205
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7265644/
Abstract

BACKGROUND

Thousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases.

RESULTS

Here, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are >99% identical to their counterparts on GRCh38. Most of the remaining genes have small differences. Forty of the protein-coding genes in GRCh38 are missing from Ash1; however, all of these genes are members of multi-gene families for which Ash1 contains other copies. Eleven genes appear on different chromosomes from their homologs in GRCh38. Alignment of DNA sequences from an unrelated Ashkenazi individual to Ash1 identified ~1 million fewer homozygous SNPs than alignment of those same sequences to the more-distant GRCh38 genome, illustrating one of the benefits of population-specific reference genomes.
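To put the figures quoted above in proportion, the following is a minimal sketch that derives a few ratios from them. It uses only the four numbers stated in the abstract; the constant names and the `summarize` helper are hypothetical and not part of the authors' pipeline, and the derived values are simple arithmetic, not results reported by the paper.

```python
# Figures copied from the abstract; all names below are illustrative only.
ASH1_BASES = 2_973_118_650          # nucleotides in Ash1
GRCH38_BASES = 2_937_639_212        # nucleotides in GRCh38, as quoted
PROTEIN_CODING_GENES = 20_157       # protein-coding genes annotated on Ash1
GENES_OVER_99_IDENTICAL = 19_563    # genes >99% identical to GRCh38 counterparts


def summarize() -> dict:
    """Derive simple differences and percentages from the quoted figures."""
    extra_bases = ASH1_BASES - GRCH38_BASES
    return {
        "extra_bases": extra_bases,
        "extra_bases_pct": 100 * extra_bases / GRCH38_BASES,
        "near_identical_gene_pct": 100 * GENES_OVER_99_IDENTICAL / PROTEIN_CODING_GENES,
        "remaining_genes": PROTEIN_CODING_GENES - GENES_OVER_99_IDENTICAL,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        # Print floats with two decimals, integers with thousands separators.
        shown = f"{value:.2f}" if isinstance(value, float) else f"{value:,}"
        print(f"{name}: {shown}")
```

On these inputs, Ash1 is longer than the quoted GRCh38 length by roughly 35 million nucleotides (a little over 1%), and about 97% of the annotated protein-coding genes fall in the >99% identity group.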

CONCLUSIONS

The Ash1 genome is presented as a reference for any genetic studies involving Ashkenazi Jewish individuals.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf4f/7265644/6d63d8f40f15/13059_2020_2047_Fig1_HTML.jpg
