Exome-wide benchmark of difficult-to-sequence regions using short-read next-generation DNA sequencing.

Affiliations

Laboratory of Computational Genomics, School of Life Sciences, Tokyo University of Pharmacy and Life Sciences, Hachioji, Tokyo 192-0392, Japan.

Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University, Higashi-ku, Fukuoka 812-8582, Japan.

Publication information

Nucleic Acids Res. 2024 Jan 11;52(1):114-124. doi: 10.1093/nar/gkad1140.

DOI:10.1093/nar/gkad1140

PMID:38015437

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10783491/

Abstract

Short-read next-generation DNA sequencing (NGS) has recently been adopted for genetic testing in a range of clinical settings. Data accuracy is crucial in such settings, and several reports on NGS quality control have been published, focusing primarily on the accuracy of sequence reads. Variant calling is another critical source of NGS errors, yet despite its established significance it remains unexplored at the single-nucleotide level. In this study, we used a machine-learning-based method to establish an exome-wide benchmark of difficult-to-sequence regions at nucleotide-residue resolution, using 10 genome sequence features based on real-world NGS data accumulated in the Genome Aggregation Database (gnomAD) for the human reference genome sequence (GRCh38/hg38). The resulting metric, designated the 'UNMET score', together with additional structural information from the human genome, allowed us to assess how challenging an exonic region of interest is to sequence with conventional short-read NGS. The UNMET score could thus provide a basis for addressing potential sequencing errors in protein-coding exons of GRCh38/hg38 in clinical sequencing.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b561/10783491/fe0e443f53b9/gkad1140figgra1.jpg

Similar articles

Factors Affecting Migration to GRCh38 in Laboratories Performing Clinical Next-Generation Sequencing.

J Mol Diagn. 2021 May;23(5):651-657. doi: 10.1016/j.jmoldx.2021.02.003. Epub 2021 Feb 22.

Opportunities and challenges of whole-genome and -exome sequencing.

BMC Genet. 2017 Feb 14;18(1):14. doi: 10.1186/s12863-017-0479-5.

Evaluation of copy number variant detection from panel-based next-generation sequencing data.

Mol Genet Genomic Med. 2019 Jan;7(1):e00513. doi: 10.1002/mgg3.513. Epub 2018 Nov 22.

Performance comparison: exome sequencing as a single test replacing Sanger sequencing.

Mol Genet Genomics. 2021 May;296(3):653-663. doi: 10.1007/s00438-021-01772-3. Epub 2021 Mar 11.

Identifying, understanding, and correcting technical artifacts on the sex chromosomes in next-generation sequencing data.

Gigascience. 2019 Jul 1;8(7). doi: 10.1093/gigascience/giz074.

Performance assessment of DNA sequencing platforms in the ABRF Next-Generation Sequencing Study.

Nat Biotechnol. 2021 Sep;39(9):1129-1140. doi: 10.1038/s41587-021-01049-5. Epub 2021 Sep 9.

Using synthetic chromosome controls to evaluate the sequencing of difficult regions within the human genome.

Genome Biol. 2022 Jan 12;23(1):19. doi: 10.1186/s13059-021-02579-6.

Systematic Evaluation of Sanger Validation of Next-Generation Sequencing Variants.

Clin Chem. 2016 Apr;62(4):647-54. doi: 10.1373/clinchem.2015.249623. Epub 2016 Feb 4.

Aligning to the sample-specific reference sequence to optimize the accuracy of next-generation sequencing analysis for hepatitis B virus.

Hepatol Int. 2016 Jan;10(1):147-57. doi: 10.1007/s12072-015-9645-x. Epub 2015 Jul 25.

Cited by

Construction of a Genome-Wide Copy Number Variation Map and Association Analysis of Black Spot in Jujube.

Plants (Basel). 2025 Sep 5;14(17):2782. doi: 10.3390/plants14172782.

Comparative evaluation of four exome enrichment solutions in 2024: Agilent, Roche, Vazyme and Nanodigmbio.

BMC Genomics. 2025 Jan 27;26(1):76. doi: 10.1186/s12864-024-11196-z.

Importance of EQA/PT for the detection of genetic variants in comprehensive cancer genome testing.

Sci Rep. 2025 Jan 7;15(1):1036. doi: 10.1038/s41598-024-84714-4.

High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation.

Genome Res. 2024 Nov 20;34(11):2061-2073. doi: 10.1101/gr.279273.124.

Japanese Public Health Insurance System's new genomic strategic action to shorten the "diagnostic odyssey" for patients with rare and intractable diseases.

J Hum Genet. 2024 Nov;69(11):549-552. doi: 10.1038/s10038-024-01285-y. Epub 2024 Aug 15.

Nanopore sequencing of 1000 Genomes Project samples to build a comprehensive catalog of human genetic variation.

medRxiv. 2024 Mar 7:2024.03.05.24303792. doi: 10.1101/2024.03.05.24303792.
