Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data.

Authors

Nodehi Hannane Mohammadi, Tabatabaiefar Mohammad Amin, Sehhati Mohammadreza

Affiliations

Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.

Department of Medical Genetics, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.

Publication information

J Med Signals Sens. 2021 Jan 30;11(1):37-44. doi: 10.4103/jmss.JMSS_7_20. eCollection 2021 Jan-Mar.

DOI:10.4103/jmss.JMSS_7_20
PMID:34026589
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8043119/
Abstract

BACKGROUND

Careful design in the primary steps of a next-generation sequencing study is critical for obtaining successful results in downstream analysis.

METHODS

In this study, a framework is proposed to evaluate and improve sequence mapping in targeted regions of the reference genome. Simulated short reads were produced from the coding regions of the human genome and mapped to a Customized Target-Based Reference (CTBR) using recently introduced alignment tools. Short reads simulated for different sequencing technologies, with and without well-defined mutation types, were aligned to both the standard genome and the CTBR, and the numbers of unmapped and misaligned reads and the runtime were measured for comparison.
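The measurement step described above can be illustrated with a minimal Python sketch. It assumes that each simulated read carries its true origin, so an alignment can be classified as unmapped, misaligned, or correct. The `ReadAlignment` record, the `mapping_error` function, and the 5 bp position tolerance are hypothetical choices made for illustration; the abstract does not state how the authors defined or computed these counts.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReadAlignment:
    """One simulated read: where it came from and where the aligner put it."""
    name: str
    true_chrom: str
    true_pos: int
    mapped_chrom: Optional[str]  # None means the aligner left the read unmapped
    mapped_pos: Optional[int]


def mapping_error(reads: Iterable[ReadAlignment], tolerance: int = 5) -> dict:
    """Count unmapped and misaligned reads and report the combined error in percent.

    A read is counted as misaligned when it maps to the wrong chromosome or more
    than `tolerance` bases from its true position (an assumed criterion).
    """
    total = unmapped = misaligned = 0
    for read in reads:
        total += 1
        if read.mapped_chrom is None or read.mapped_pos is None:
            unmapped += 1
        elif (read.mapped_chrom != read.true_chrom
              or abs(read.mapped_pos - read.true_pos) > tolerance):
            misaligned += 1
    if total == 0:
        raise ValueError("no reads to evaluate")
    return {
        "total": total,
        "unmapped": unmapped,
        "misaligned": misaligned,
        "error_pct": 100.0 * (unmapped + misaligned) / total,
    }


# Toy input: one exact hit, one hit within tolerance, one unmapped read,
# and one read placed on the wrong chromosome.
example_reads = [
    ReadAlignment("r1", "chr20", 1000, "chr20", 1000),
    ReadAlignment("r2", "chr20", 2000, "chr20", 2003),
    ReadAlignment("r3", "chrY", 3000, None, None),
    ReadAlignment("r4", "chrY", 4000, "chrX", 4000),
]
summary = mapping_error(example_reads)
print(summary)
```

Running the same function on the output of each aligner and reference combination would give comparable error figures; runtime would have to be recorded separately.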

RESULTS

The mapping accuracy for reads generated from the Illumina HiSeq 2500 with Stampy as the alignment tool and the CTBR as the reference was significantly better than that of the other evaluated pipelines. Using the CTBR for alignment significantly decreased the mapping error compared with more expanded or more limited references. When intentional mutations were introduced into the reads, Stampy showed the minimum error of 1.67% using the CTBR, whereas the lowest errors obtained by Stampy using the whole genome and a single chromosome as references were 3.78% and 20%, respectively. The maximum and minimum misalignment errors were observed on chromosomes Y and 20, respectively.
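To put the reported Stampy error rates in proportion, the short Python sketch below computes the relative reduction in error when the CTBR replaces each alternative reference. The three percentages are taken from the results above; the `relative_reduction` helper is a hypothetical name, and relative reduction is a derived figure that the abstract itself does not report.

```python
def relative_reduction(baseline_pct: float, improved_pct: float) -> float:
    """Return the percentage by which the error drops relative to the baseline."""
    if baseline_pct <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_pct - improved_pct) / baseline_pct


# Lowest Stampy error rates reported in the abstract (reads with mutations).
ctbr_error = 1.67
whole_genome_error = 3.78
single_chromosome_error = 20.0

vs_whole_genome = relative_reduction(whole_genome_error, ctbr_error)
vs_single_chromosome = relative_reduction(single_chromosome_error, ctbr_error)

print(f"CTBR vs whole genome: {vs_whole_genome:.2f}% lower error")
print(f"CTBR vs single chromosome: {vs_single_chromosome:.2f}% lower error")
```

On these figures the CTBR error is roughly 56% lower than with the whole genome and roughly 92% lower than with a single chromosome as the reference.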

CONCLUSION

Therefore, using the proposed framework in a clinical targeted sequencing study may help to predict the error and improve the performance of variant calling in the genomic regions targeted by the study.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b364/8043119/2deda1038a14/JMSS-11-37-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b364/8043119/2cfecb89657b/JMSS-11-37-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b364/8043119/714e28252702/JMSS-11-37-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b364/8043119/c7f4aa6e0b98/JMSS-11-37-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b364/8043119/254b0bda0be1/JMSS-11-37-g005.jpg
