Liu Meiling, Liu Yang, Wu Michael C, Hsu Li, He Qianchuan
Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
Department of Mathematics and Statistics, Wright State University, Dayton, OH 45435, USA.
Bioinformatics. 2021 Apr 9;37(1):50-56. doi: 10.1093/bioinformatics/btaa1090.
Cancer is a highly heterogeneous disease, and virtually all types of cancer have subtypes. Understanding the association between cancer subtypes and genetic variation is fundamental to the development of targeted therapies for patients. Somatic mutations play an important role in tumor development and have emerged as a new type of genetic variation for studying associations with cancer subtypes. However, the low prevalence of individual mutations poses a substantial challenge for statistical analysis.
In this article, we propose an approach, subtype analysis with somatic mutations (SASOM), for the association analysis of cancer subtypes with somatic mutations. Our approach tests the association between a set of somatic mutations (from a genetic pathway) and subtypes, while incorporating functional information about the mutations into the analysis. We further propose a robust p-value combination procedure, DAPC, to synthesize statistical significance from different sources. Simulation studies show that the proposed approach maintains the correct type I error rate and tends to be more powerful than alternative methods. In a real data application, we examine the somatic mutations in a cutaneous melanoma dataset and identify a genetic pathway that is associated with immune-related subtypes.
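The abstract does not describe how DAPC combines p-values, so the sketch below is not DAPC. It only illustrates the general idea of merging several p-values into one, using two standard methods chosen here as assumptions: Fisher's method (which assumes independent p-values) and the Cauchy combination test (which remains approximately valid under dependence). The function names `fisher_combine` and `cauchy_combine` are hypothetical and are not part of the SASOM package.

```python
import math


def _check(pvalues):
    # Both methods below need at least one p-value, each in (0, 1].
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must be a non-empty sequence in (0, 1]")


def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(log p_i) ~ chi-square with 2k df.

    Assumes the k p-values are independent. For even degrees of freedom
    the chi-square survival function has the closed form
    exp(-x/2) * sum_{j<k} (x/2)^j / j!, so no external library is needed.
    """
    _check(pvalues)
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return min(1.0, math.exp(-half_x) * total)


def cauchy_combine(pvalues, weights=None):
    """Cauchy combination test: weighted mean of tan((0.5 - p_i) * pi).

    The statistic is compared with a standard Cauchy distribution.
    Weights default to equal and are normalised to sum to one.
    """
    _check(pvalues)
    if weights is None:
        weights = [1.0] * len(pvalues)
    wsum = sum(weights)
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvalues)) / wsum
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # Illustrative p-values only; these are not results from the paper.
    example = [0.01, 0.20, 0.45]
    print("Fisher:", fisher_combine(example))
    print("Cauchy:", cauchy_combine(example))
```

With a single input, both functions return that p-value unchanged, which is a quick sanity check on the formulas.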
The SASOM R package is available at https://github.com/rksyouyou/SASOM-pkg. R scripts and data are available at https://github.com/rksyouyou/SASOM-analysis.
Supplementary data are available at Bioinformatics online.