
De novo variant calling identifies cancer mutation signatures in the 1000 Genomes Project.

Affiliations

Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, USA.

NVIDIA Corporation, Santa Clara, California, USA.

Publication information

Hum Mutat. 2022 Dec;43(12):1979-1993. doi: 10.1002/humu.24455. Epub 2022 Sep 10.

Abstract

Detection of de novo variants (DNVs) is critical for studies of disease-related variation and mutation rates. To accelerate DNV calling, we developed a graphics processing unit (GPU)-based workflow. We applied our workflow to whole-genome sequencing data from three sequenced parent-child cohorts: the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G), which were sequenced using DNA from blood, saliva, and lymphoblastoid cell lines (LCLs), respectively. The SSC and SPARK DNV callsets were within expectations for number of DNVs, percent at CpG sites, phasing to the paternal chromosome of origin, and average allele balance. However, the 1000G DNV callset was not within expectations and contained excessive DNVs that are likely cell line artifacts. Mutation signature analysis revealed that 30% of 1000G DNV signatures matched B-cell lymphoma. Furthermore, we found variants in DNA repair genes and at ClinVar pathogenic or likely-pathogenic sites, and a significant excess of protein-coding DNVs in IGLL5, a gene known to be involved in B-cell lymphomas. Our study provides a new rapid DNV caller for the field and elucidates important implications of using sequencing data from LCLs for reference building and disease-related projects.
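
The abstract does not spell out the calling rules, so the following is only a minimal Python sketch of the general idea behind trio-based DNV screening and the allele-balance check mentioned above. It is not the authors' GPU workflow. The `TrioSite` record, the function names, and the 0.25 to 0.75 allele-balance window are all hypothetical choices made for illustration: a site is flagged when the child is heterozygous, both parents are homozygous reference, and the child's alternate-allele fraction is plausibly germline (near 0.5).

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class TrioSite:
    """Genotypes and child read depths at one site (hypothetical layout)."""
    child_gt: Tuple[int, int]
    father_gt: Tuple[int, int]
    mother_gt: Tuple[int, int]
    child_ref_depth: int
    child_alt_depth: int


def allele_balance(site: TrioSite) -> float:
    """Fraction of the child's reads that support the alternate allele."""
    total = site.child_ref_depth + site.child_alt_depth
    return site.child_alt_depth / total if total else 0.0


def is_candidate_dnv(site: TrioSite, min_ab: float = 0.25, max_ab: float = 0.75) -> bool:
    """Naive filter; the allele-balance thresholds are illustrative assumptions."""
    child_het = sorted(site.child_gt) == [0, 1]
    parents_hom_ref = site.father_gt == (0, 0) and site.mother_gt == (0, 0)
    return child_het and parents_hom_ref and min_ab <= allele_balance(site) <= max_ab


def mean_allele_balance(sites: List[TrioSite]) -> float:
    """Average allele balance across a callset (0.0 for an empty callset)."""
    return mean(allele_balance(s) for s in sites) if sites else 0.0


# Made-up example sites, not data from the study.
sites = [
    TrioSite((0, 1), (0, 0), (0, 0), 15, 15),  # balanced het, parents hom-ref
    TrioSite((0, 1), (0, 1), (0, 0), 10, 10),  # inherited from the father
    TrioSite((0, 1), (0, 0), (0, 0), 28, 2),   # low alt fraction, likely artifact
]
candidates = [s for s in sites if is_candidate_dnv(s)]
```

In a sketch like this, a callset whose mean allele balance drifts well below 0.5 would be one hint that many calls are not true germline events, which is the kind of deviation the abstract reports for the LCL-derived 1000G data.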

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/692a/10087346/0db963c53963/HUMU-43-1979-g001.jpg
