High-quality Arabidopsis thaliana Genome Assembly with Nanopore and HiFi Long Reads.

Authors

Wang Bo, Yang Xiaofei, Jia Yanyan, Xu Yu, Jia Peng, Dang Ningxin, Wang Songbo, Xu Tun, Zhao Xixi, Gao Shenghan, Dong Quanbin, Ye Kai

Affiliations

MOE Key Laboratory for Intelligent Networks & Network Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

Publication information

Genomics Proteomics Bioinformatics. 2022 Feb;20(1):4-13. doi: 10.1016/j.gpb.2021.08.003. Epub 2021 Sep 3.

DOI:10.1016/j.gpb.2021.08.003
PMID:34487862
Full text:
https://pmc.ncbi.nlm.nih.gov/articles/PMC9510872/
Abstract

Arabidopsis thaliana is an important and long-established model species for plant molecular biology, genetics, epigenetics, and genomics. However, the latest version of reference genome still contains a significant number of missing segments. Here, we reported a high-quality and almost complete Col-0 genome assembly with two gaps (named Col-XJTU) by combining the Oxford Nanopore Technologies ultra-long reads, Pacific Biosciences high-fidelity long reads, and Hi-C data. The total genome assembly size is 133,725,193 bp, introducing 14.6 Mb of novel sequences compared to the TAIR10.1 reference genome. All five chromosomes of the Col-XJTU assembly are highly accurate with consensus quality (QV) scores > 60 (ranging from 62 to 68), which are higher than those of the TAIR10.1 reference (ranging from 45 to 52). We completely resolved chromosome (Chr) 3 and Chr5 in a telomere-to-telomere manner. Chr4 was completely resolved except the nucleolar organizing regions, which comprise long repetitive DNA fragments. The Chr1 centromere (CEN1), reportedly around 9 Mb in length, is particularly challenging to assemble due to the presence of tens of thousands of CEN180 satellite repeats. Using the cutting-edge sequencing data and novel computational approaches, we assembled a 3.8-Mb-long CEN1 and a 3.5-Mb-long CEN2. We also investigated the structure and epigenetics of centromeres. Four clusters of CEN180 monomers were detected, and the centromere-specific histone H3-like protein (CENH3) exhibited a strong preference for CEN180 Cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. We believe that this high-quality genome assembly, Col-XJTU, would serve as a valuable reference to better understand the global pattern of centromeric polymorphisms, as well as the genetic and epigenetic features in plants.
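
The QV figures quoted in the abstract are easier to compare once converted to error rates. The sketch below assumes the usual Phred-style definition, QV = -10 · log10(per-base error probability); the abstract does not say how QV was computed, and the function names are illustrative only. The last column is a hypothetical extrapolation that applies a single QV to the full 133,725,193 bp assembly, whereas the paper reports QV per chromosome.

```python
def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled consensus quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def expected_errors(qv: float, length_bp: int) -> float:
    """Expected number of erroneous bases in length_bp bases at the given QV."""
    return qv_to_error_rate(qv) * length_bp


# Total assembly size as stated in the abstract.
ASSEMBLY_SIZE_BP = 133_725_193

# QV ranges as stated in the abstract.
QV_VALUES = [
    ("TAIR10.1, low end", 45),
    ("TAIR10.1, high end", 52),
    ("Col-XJTU, low end", 62),
    ("Col-XJTU, high end", 68),
]

if __name__ == "__main__":
    for label, qv in QV_VALUES:
        rate = qv_to_error_rate(qv)
        # Hypothetical: pretends the whole assembly has this one QV.
        errors = expected_errors(qv, ASSEMBLY_SIZE_BP)
        print(f"{label}: QV {qv} -> {rate:.2e} errors/base, "
              f"about {errors:.0f} errors over {ASSEMBLY_SIZE_BP:,} bp")
```

Under this definition, every 10 QV points correspond to a tenfold lower error rate, so QV 60 means roughly one error per million bases.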

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/e919a35d8adf/fx13.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/ad610720333d/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/8718b39e6b00/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/a4abb3705e89/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/dc69532957d2/fx2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/7b50fbad2fa0/fx3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/812327c37677/fx4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/70f6b93a88ab/fx5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/3582a5a5b8d0/fx6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/c14276b35dd0/fx7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/eeddfd9bea1b/fx8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/cf75de9cc9a2/fx9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/5edc3b6968a7/fx10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/78a968899423/fx11.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fa6/9510872/cb581305df83/fx12.jpg

Similar articles

1
High-quality Arabidopsis thaliana Genome Assembly with Nanopore and HiFi Long Reads.
Genomics Proteomics Bioinformatics. 2022 Feb;20(1):4-13. doi: 10.1016/j.gpb.2021.08.003. Epub 2021 Sep 3.
2
The genetic and epigenetic landscape of the Arabidopsis centromeres.
Science. 2021 Nov 12;374(6569):eabi7489. doi: 10.1126/science.abi7489.
3
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.
Genome Res. 2017 Mar;27(3):471-478. doi: 10.1101/gr.214619.116. Epub 2017 Feb 21.
4
Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes.
Nucleic Acids Res. 2022 Nov 28;50(21):12309-12327. doi: 10.1093/nar/gkac1115.
5
The string decomposition problem and its applications to centromere analysis and assembly.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i93-i101. doi: 10.1093/bioinformatics/btaa454.
6
Epigenetic modification of centromeric chromatin: hypomethylation of DNA sequences in the CENH3-associated chromatin in Arabidopsis thaliana and maize.
Plant Cell. 2008 Jan;20(1):25-34. doi: 10.1105/tpc.107.057083. Epub 2008 Jan 31.
7
The rapidly evolving centromere-specific histone has stringent functional requirements in Arabidopsis thaliana.
Genetics. 2010 Oct;186(2):461-71. doi: 10.1534/genetics.110.120337. Epub 2010 Jul 13.
8
Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore.
Gigascience. 2020 Dec 15;9(12). doi: 10.1093/gigascience/giaa123.
9
Prospects of telomere-to-telomere assembly in barley: Analysis of sequence gaps in the MorexV3 reference genome.
Plant Biotechnol J. 2022 Jul;20(7):1373-1386. doi: 10.1111/pbi.13816. Epub 2022 Apr 7.
10
New insights into centromeres from Arabidopsis Col-CEN assembly.
Trends Genet. 2022 May;38(5):416-418. doi: 10.1016/j.tig.2022.02.001. Epub 2022 Feb 15.

Cited by

1
Chlorophyllide a Oxygenase (CAO) Gene Duplication Across the Viridiplantae.
J Mol Evol. 2025 Sep 15. doi: 10.1007/s00239-025-10266-4.
2
Gymnosperm-specific CYP90Js enable biflavonoid biosynthesis and microbial production of amentoflavone.
Nat Commun. 2025 Aug 21;16(1):7792. doi: 10.1038/s41467-025-62990-6.
3
Microbial signatures and ARGs profiles in two coastal ecosystems of Shenzhen under distinct anthropogenic influences.
iScience. 2025 Jul 16;28(8):113133. doi: 10.1016/j.isci.2025.113133. eCollection 2025 Aug 15.
4
Topsicle: a method for estimating telomere length from whole genome long-read sequencing data.
bioRxiv. 2025 Jul 15:2025.07.10.664126. doi: 10.1101/2025.07.10.664126.
5
The evolutionary dynamics of organellar pan-genomes in Arabidopsis thaliana.
Genome Biol. 2025 Aug 11;26(1):240. doi: 10.1186/s13059-025-03717-0.
6
A gap-free reference genome of Populus deltoides provides insights into karyotype evolution of Salicaceae.
BMC Biol. 2025 Jul 7;23(1):201. doi: 10.1186/s12915-025-02304-w.
7
Telomere-to-telomere genome assembly of linseed (Linum usitatissimum L.) for functional genomics and accelerated genetic improvement.
Plant Biotechnol J. 2025 Jun 19. doi: 10.1111/pbi.70183.
8
Advancing ecological and evolutionary research in Arabidopsis: Extending insights into model and nonmodel plants.
Plant Cell. 2025 Jul 1;37(7). doi: 10.1093/plcell/koaf151.
9
Colora: a Snakemake workflow for complete chromosome-scale de novo genome assembly.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf175.
10
A telomere-to-telomere genome assembly coupled with multi-omic data provides insights into the evolution of hexaploid bread wheat.
Nat Genet. 2025 Apr;57(4):1008-1020. doi: 10.1038/s41588-025-02137-x. Epub 2025 Apr 7.

References

1
The complete sequence of a human genome.
Science. 2022 Apr;376(6588):44-53. doi: 10.1126/science.abj6987. Epub 2022 Mar 31.
2
The genetic and epigenetic landscape of the Arabidopsis centromeres.
Science. 2021 Nov 12;374(6569):eabi7489. doi: 10.1126/science.abi7489.
3
The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types.
Genomics Proteomics Bioinformatics. 2021 Aug;19(4):578-583. doi: 10.1016/j.gpb.2021.08.001. Epub 2021 Aug 13.
4
Genome Warehouse: A Public Repository Housing Genome-scale Data.
Genomics Proteomics Bioinformatics. 2021 Aug;19(4):584-589. doi: 10.1016/j.gpb.2021.04.001. Epub 2021 Jun 24.
5
Two gap-free reference genomes and a global view of the centromere architecture in rice.
Mol Plant. 2021 Oct 4;14(10):1757-1767. doi: 10.1016/j.molp.2021.06.018. Epub 2021 Jun 24.
6
The structure, function and evolution of a complete human chromosome 8.
Nature. 2021 May;593(7857):101-107. doi: 10.1038/s41586-021-03420-7. Epub 2021 Apr 7.
7
Anno genominis XX: 20 years of Arabidopsis genomics.
Plant Cell. 2021 May 31;33(4):832-845. doi: 10.1093/plcell/koaa038.
8
Benchmarking of next and third generation sequencing technologies and their associated algorithms for genome assembly.
Mol Med Rep. 2021 Apr;23(4). doi: 10.3892/mmr.2021.11890. Epub 2021 Feb 4.
9
Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm.
Nat Methods. 2021 Feb;18(2):170-175. doi: 10.1038/s41592-020-01056-5. Epub 2021 Feb 1.
10
Liftoff: accurate mapping of gene annotations.
Bioinformatics. 2021 Jul 19;37(12):1639-1643. doi: 10.1093/bioinformatics/btaa1016.