Challenges, Solutions, and Quality Metrics of Personal Genome Assembly in Advancing Precision Medicine.

Author information

Xiao Wenming, Wu Leihong, Yavas Gokhan, Simonyan Vahan, Ning Baitang, Hong Huixiao

Affiliations

National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, AR 72079, USA.

Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993, USA.

Publication information

Pharmaceutics. 2016 Apr 22;8(2):15. doi: 10.3390/pharmaceutics8020015.

DOI:10.3390/pharmaceutics8020015

PMID:27110816

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4932478/

Abstract

Although any two people share more than 99% of their genomic DNA sequence, millions of small sequence and structural differences distinguish individuals and contribute to differences in appearance and in response to medical treatment. Currently, genetic variants in diseased tissues, such as tumors, are identified by comparing the sequences detected in the diseased tissue against the reference genome. However, the public reference genome was derived from the DNA of multiple individuals. As a result, it is incomplete and may misrepresent the sequence variants of the general population. A more reliable approach is to compare sequences from diseased tissue with the same individual's genome sequence derived from normal tissue. Now that the price of sequencing a human genome has dropped dramatically to around $1000, documenting a personal genome for every individual looks increasingly feasible. However, de novo assembly of individual genomes at an affordable cost remains challenging, and to date only a few human genomes have been fully assembled. In this review, we introduce the history of human genome sequencing and the evolution of sequencing platforms, from Sanger sequencing to emerging "third generation sequencing" technologies. We present the currently available de novo assembly and post-assembly software packages for human genome assembly and their computational infrastructure requirements. We recommend hybrid assembly that combines long and short reads as a promising way to generate good-quality human genome assemblies, and we specify parameters for assessing the quality of assembly outcomes. We offer a perspective on the benefits of using personal genomes as references, along with suggestions for obtaining a high-quality personal genome. Finally, we discuss the use of the personal genome in aiding vaccine design and development, monitoring host immune response, tailoring drug therapy, and detecting tumors. We believe that precision medicine will benefit greatly from bioinformatics solutions, particularly for personal genome assembly.
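The abstract mentions parameters for assessing assembly quality without naming them here. As an illustration only, the sketch below computes N50, a widely used contiguity statistic for genome assemblies. The choice of N50, the function name `n50`, and the sample contig lengths are assumptions made for this example and are not taken from the abstract.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, walking through contigs
    from longest to shortest, the running total first reaches at least
    half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")

    # Sort from longest to shortest so the largest contigs are counted first.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2

    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (in bases), for illustration only.
example_contigs = [80, 70, 50, 40, 30, 20]
print(n50(example_contigs))  # total is 290, half is 145; 80 + 70 = 150, so N50 is 70
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, so it is usually read alongside accuracy-oriented checks rather than on its own.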

Figure (image from the original article): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64d6/4932478/e03fa678707e/pharmaceutics-08-00015-g001.jpg

Similar articles

VGEA: an RNA viral assembly toolkit.

PeerJ. 2021 Sep 6;9:e12129. doi: 10.7717/peerj.12129. eCollection 2021.

Software for pre-processing Illumina next-generation sequencing short read sequences.

Source Code Biol Med. 2014 May 3;9:8. doi: 10.1186/1751-0473-9-8. eCollection 2014.

SMRT sequencing only de novo assembly of the sugar beet (Beta vulgaris) chloroplast genome.

BMC Bioinformatics. 2015 Sep 16;16(1):295. doi: 10.1186/s12859-015-0726-6.

Benchmarking hybrid assemblies of Giardia and prediction of widespread intra-isolate structural variation.

Parasit Vectors. 2020 Feb 28;13(1):108. doi: 10.1186/s13071-020-3968-8.

De novo likelihood-based measures for comparing genome assemblies.

BMC Res Notes. 2013 Aug 22;6:334. doi: 10.1186/1756-0500-6-334.

Hybrid De Novo Genome Assembly for the Generation of Complete Genomes of Urinary Bacteria using Short- and Long-read Sequencing Technologies.

J Vis Exp. 2021 Aug 20(174). doi: 10.3791/62872.

The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

PLoS One. 2012;7(12):e48837. doi: 10.1371/journal.pone.0048837. Epub 2012 Dec 12.

Rapid Low-Cost Assembly of the Reference Genome Using Low-Coverage, Long-Read Sequencing.

G3 (Bethesda). 2018 Oct 3;8(10):3143-3154. doi: 10.1534/g3.118.200162.

High quality 3C de novo assembly and annotation of a multidrug resistant ST-111 Pseudomonas aeruginosa genome: Benchmark of hybrid and non-hybrid assemblers.

Sci Rep. 2020 Jan 29;10(1):1392. doi: 10.1038/s41598-020-58319-6.

Cited by

Microsatellite instability assessment is instrumental for Predictive, Preventive and Personalised Medicine: status quo and outlook.

EPMA J. 2023 Jan 25;14(1):143-165. doi: 10.1007/s13167-023-00312-w. eCollection 2023 Mar.

Personalized genome assembly for accurate cancer somatic mutation discovery using tumor-normal paired reference samples.

Genome Biol. 2022 Nov 9;23(1):237. doi: 10.1186/s13059-022-02803-x.

SAUTE: sequence assembly using target enrichment.

BMC Bioinformatics. 2021 Jul 21;22(1):375. doi: 10.1186/s12859-021-04174-9.

Nanomaterial Databases: Data Sources for Promoting Design and Risk Assessment of Nanomaterials.

Nanomaterials (Basel). 2021 Jun 18;11(6):1599. doi: 10.3390/nano11061599.

dnAQET: a framework to compute a consolidated metric for benchmarking quality of de novo assemblies.

BMC Genomics. 2019 Sep 11;20(1):706. doi: 10.1186/s12864-019-6070-x.

SKESA: strategic k-mer extension for scrupulous assemblies.

Genome Biol. 2018 Oct 4;19(1):153. doi: 10.1186/s13059-018-1540-z.

Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches.

Sci Rep. 2017 Sep 8;7(1):10963. doi: 10.1038/s41598-017-10826-9.

Snake Genome Sequencing: Results and Future Prospects.

Toxins (Basel). 2016 Dec 1;8(12):360. doi: 10.3390/toxins8120360.
