Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children's Hospital, Boston, MA, USA.
Nat Biotechnol. 2020 Mar;38(3):314-319. doi: 10.1038/s41587-019-0368-8. Epub 2020 Jan 6.
Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80-90% of the mosaic single-nucleotide variants and 60-80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.