Automated assembly of centromeres from ultra-long error-prone reads.

Affiliations

Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA.

Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.

Publication information

Nat Biotechnol. 2020 Nov;38(11):1309-1316. doi: 10.1038/s41587-020-0582-4. Epub 2020 Jul 14.

Abstract

Centromeric variation has been linked to cancer and infertility, but centromere sequences contain multiple tandem repeats and can only be assembled manually from long error-prone reads. Here we describe the centroFlye algorithm for centromere assembly using long error-prone reads, and apply it to assemble human centromeres on chromosomes 6 and X. Our analyses reveal putative breakpoints in the manual reconstruction of the human X centromere, demonstrate that human X chromosome is partitioned into repeat subfamilies and provide initial insights into centromere evolution. We anticipate that centroFlye could be applied to automatically close remaining multimegabase gaps in the reference human genome.

Similar articles

Linear assembly of a human centromere on the Y chromosome.
Nat Biotechnol. 2018 Apr;36(4):321-323. doi: 10.1038/nbt.4109. Epub 2018 Mar 19.

Complete genomic and epigenetic maps of human centromeres.
Science. 2022 Apr;376(6588):eabl4178. doi: 10.1126/science.abl4178. Epub 2022 Apr 1.

Cited by

Fast sequence alignment for centromeres with RaMA.
Genome Res. 2025 May 2;35(5):1209-1218. doi: 10.1101/gr.279763.124.

Leveraging the power of long reads for targeted sequencing.
Genome Res. 2024 Nov 20;34(11):1701-1718. doi: 10.1101/gr.279168.124.

Centromere Landscapes Resolved from Hundreds of Human Genomes.
Genomics Proteomics Bioinformatics. 2024 Dec 3;22(5). doi: 10.1093/gpbjnl/qzae071.

