Duncavage Eric J, Coleman Joshua F, de Baca Monica E, Kadri Sabah, Leon Annette, Routbort Mark, Roy Somak, Suarez Carlos J, Vanderbilt Chad, Zook Justin M
In Silico Pipeline Validation Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri.
In Silico Pipeline Validation Working Group of the Clinical Practice Committee, Association for Molecular Pathology, Rockville, Maryland; Department of Pathology, University of Utah, Salt Lake City, Utah.
J Mol Diagn. 2023 Jan;25(1):3-16. doi: 10.1016/j.jmoldx.2022.09.007. Epub 2022 Oct 13.
In silico approaches for next-generation sequencing (NGS) data modeling have utility in the clinical laboratory as a tool for clinical assay validation. In silico NGS data can take a variety of forms, including pure simulated data or manipulated data files in which variants are inserted into existing data files. In silico data enable simulation of a range of variants that may be difficult to obtain from a single physical sample. Such data allow laboratories to more accurately test the performance of clinical bioinformatics pipelines without sequencing additional cases. For example, clinical laboratories may use in silico data to simulate low variant allele fraction variants to test the analytical sensitivity of variant calling software or simulate a range of insertion/deletion sizes to determine the performance of insertion/deletion calling software. In this article, the Working Group reviews the different types of in silico data with their strengths and limitations, methods to generate in silico data, and how data can be used in the clinical molecular diagnostic laboratory. Survey data indicate how in silico NGS data are currently being used. Finally, potential applications for which in silico data may become useful in the future are presented.
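As a rough illustration of the low variant allele fraction example in the abstract, the sketch below estimates how often a spiked-in variant would reach a caller's minimum supporting-read threshold. It is a simplified model written for this record, not the Working Group's method. The function names, the binomial read-sampling model, the depth of 500, and the threshold of 5 alternate reads are all assumptions chosen for illustration. Real in silico tools edit reads in actual sequence files, and real callers apply many more filters.

```python
import random


def spike_in_alt_reads(depth, vaf, rng):
    """Count how many of `depth` reads would carry the variant.

    Assumption: each read independently carries the variant with
    probability `vaf` (a simple binomial model of a spike-in).
    """
    return sum(1 for _ in range(depth) if rng.random() < vaf)


def estimated_sensitivity(depth, vaf, min_alt_reads, trials=2000, seed=0):
    """Fraction of simulated sites where the variant would be reported.

    Assumption: a hypothetical caller reports a variant whenever at
    least `min_alt_reads` reads support it; no other filters are modeled.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    detected = sum(
        1
        for _ in range(trials)
        if spike_in_alt_reads(depth, vaf, rng) >= min_alt_reads
    )
    return detected / trials


if __name__ == "__main__":
    # Sweep a range of variant allele fractions at a fixed, assumed depth.
    for vaf in (0.01, 0.02, 0.05, 0.10):
        s = estimated_sensitivity(depth=500, vaf=vaf, min_alt_reads=5)
        print(f"VAF {vaf:.2f}: estimated sensitivity {s:.3f}")
```

Sweeping the allele fraction in this way mirrors, in miniature, how a laboratory might probe analytical sensitivity near a pipeline's limit of detection without sequencing additional cases.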