Analysis of machine learning algorithms as integrative tools for validation of next generation sequencing data.

Affiliation

MAGI Euregio, Bolzano,

Publication information

Eur Rev Med Pharmacol Sci. 2019 Sep;23(18):8139-8147. doi: 10.26355/eurrev_201909_19034.

DOI:10.26355/eurrev_201909_19034
PMID:31599443
Abstract

OBJECTIVE

While next generation sequencing (NGS) has become the technology of choice for clinical diagnostics, most genetic laboratories still use Sanger sequencing for orthogonal confirmation of NGS results. Previous studies have shown that when the quality of NGS data is high, most calls are confirmed by Sanger sequencing, making confirmation redundant. We aimed to establish a set of criteria that distinguish NGS calls that need orthogonal confirmation from those that do not, which would significantly decrease the amount of work necessary to reach a diagnosis.

MATERIALS AND METHODS

A data set of 7976 NGS calls confirmed as true or false positive by Sanger sequencing was used to train and test different machine learning (ML) approaches. By varying the size and class balance of the training dataset, we measured the performance of the different algorithms to determine the conditions under which ML is a valid approach for confirming NGS calls in a diagnostic environment.
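The idea of varying training-set size and class balance can be illustrated with a minimal sketch. It is not the authors' code: the `subsample` helper, the single quality feature, and the synthetic calls are all assumptions made for illustration, and only the Python standard library is used.

```python
import random

def subsample(calls, size, true_fraction, seed=0):
    """Draw a training subset with a chosen size and class balance.

    calls: list of (features, label) pairs, where label is True for a
           Sanger-confirmed call and False for a false positive.
    """
    rng = random.Random(seed)
    true_calls = [c for c in calls if c[1]]
    false_calls = [c for c in calls if not c[1]]
    n_true = round(size * true_fraction)
    n_false = size - n_true
    if n_true > len(true_calls) or n_false > len(false_calls):
        raise ValueError("not enough calls of one class for this size/balance")
    subset = rng.sample(true_calls, n_true) + rng.sample(false_calls, n_false)
    rng.shuffle(subset)
    return subset

# Synthetic stand-in data (hypothetical): one quality score per call.
data_rng = random.Random(42)
calls = [((data_rng.gauss(60, 10),), True) for _ in range(900)]
calls += [((data_rng.gauss(30, 10),), False) for _ in range(100)]

# Sweep over training-set sizes and true/false balances.
for size in (50, 100):
    for true_fraction in (0.5, 0.9):
        train = subsample(calls, size, true_fraction)
        n_true = sum(1 for _, label in train if label)
        print(size, true_fraction, n_true, len(train) - n_true)
```

Each subset produced this way could then be used to train a classifier and score it on held-out calls, so that performance can be compared across sizes and balances.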

RESULTS

Our results indicate that machine learning is a valid approach to find variant calls that need more investigation, but in order to reach the high accuracy required in a clinical environment, the training data set must include enough observations and these observations must be well-balanced between true/false positive NGS calls.

CONCLUSIONS

Our results show that it is possible to integrate the diagnostic NGS validation workflow with a machine learning approach to reduce the number of Sanger confirmations of high-quality NGS calls, reducing the time and costs of diagnosis.
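One way such an integration might look in practice is a simple triage step, sketched below. The `triage` function, the call identifiers, and the 0.99 probability threshold are hypothetical; the abstract does not specify how calls are routed, and a real cutoff would need clinical validation.

```python
def triage(calls, threshold=0.99):
    """Split NGS calls into those reported directly and those sent to Sanger.

    calls: list of (call_id, p_true) pairs, where p_true is a model's
           estimated probability that the call is a true positive.
    threshold: assumed cutoff; calls scoring below it get Sanger confirmation.
    """
    skip_sanger = [call_id for call_id, p_true in calls if p_true >= threshold]
    needs_sanger = [call_id for call_id, p_true in calls if p_true < threshold]
    return skip_sanger, needs_sanger

# Hypothetical model outputs for four calls.
scored = [("var1", 0.999), ("var2", 0.42), ("var3", 0.995), ("var4", 0.97)]
skip_sanger, needs_sanger = triage(scored)
saved = len(skip_sanger) / len(scored)
print(skip_sanger, needs_sanger, saved)
```

A stricter threshold sends more calls to Sanger and saves less work; a looser one saves more but risks reporting a false positive unconfirmed.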

