Association analysis using somatic mutations.

Affiliations

Department of Mathematics and Statistics, Wright State University, Dayton, Ohio, United States of America.

Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.

Publication information

PLoS Genet. 2018 Nov 2;14(11):e1007746. doi: 10.1371/journal.pgen.1007746. eCollection 2018 Nov.

Abstract

Somatic mutations drive the growth of tumor cells and are pivotal biomarkers for many cancer treatments. Genetic association analysis using somatic mutations is an effective approach to study the functional impact of somatic mutations. However, standard regression methods are not appropriate for somatic mutation association studies because somatic mutation calls often have non-ignorable false positive and/or false negative rates. Large-scale association analysis using somatic mutations has recently become feasible, thanks to improvements in sequencing techniques and reductions in sequencing cost, so there is an urgent need for a new statistical method designed for somatic mutation association analysis. We propose such a method with a computationally efficient software implementation: Somatic mutation Association test with Measurement Errors (SAME). SAME accounts for somatic mutation calling uncertainty using a likelihood-based approach. It can be used to assess the associations between continuous/dichotomous outcomes and individual mutations or gene-level mutations. Through simulation studies across a wide range of realistic scenarios, we show that SAME can significantly improve statistical power compared with the naive generalized linear model that ignores mutation calling uncertainty. Finally, using data collected from The Cancer Genome Atlas (TCGA) project, we apply SAME to study the associations between somatic mutations and gene expression in 12 cancer types, as well as the associations between somatic mutations and colon cancer subtype defined by DNA methylation data. SAME recovered some interesting findings that were missed by the generalized linear model. In addition, we demonstrate that mutation-level and gene-level analyses are often more appropriate for oncogenes and tumor-suppressor genes, respectively.
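
The abstract does not spell out the form of the likelihood, so the following is only a minimal sketch of the general idea of a likelihood-based test that accounts for uncertain mutation calls; it is not the SAME implementation. It assumes a continuous outcome, a per-sample probability `p` that the sample truly carries the mutation (a hypothetical input standing in for whatever the mutation caller provides), and a two-component normal mixture fitted by EM. The names `fit_em`, `loglik`, and the simulated data are all illustrative assumptions.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def loglik(y, p, b0, b1, sd):
    """Marginal log-likelihood, summing over the unobserved true mutation status."""
    return sum(
        math.log(pi * normal_pdf(yi, b0 + b1, sd)
                 + (1.0 - pi) * normal_pdf(yi, b0, sd))
        for yi, pi in zip(y, p)
    )


def fit_em(y, p, n_iter=200):
    """Fit y = b0 + b1 * S + noise, where S is latent and P(S = 1) = p (assumed given)."""
    n = len(y)
    # Start at the null fit (b1 = 0), which is also the null maximum-likelihood estimate.
    b0 = sum(y) / n
    b1 = 0.0
    sd = math.sqrt(sum((yi - b0) ** 2 for yi in y) / n)
    ll_null = loglik(y, p, b0, b1, sd)
    for _ in range(n_iter):
        # E-step: posterior probability that each sample is truly mutated.
        w = []
        for yi, pi in zip(y, p):
            a = pi * normal_pdf(yi, b0 + b1, sd)
            b = (1.0 - pi) * normal_pdf(yi, b0, sd)
            w.append(a / (a + b))
        # M-step: weighted group means and a pooled variance.
        sw = sum(w)
        mu1 = sum(wi * yi for wi, yi in zip(w, y)) / sw
        mu0 = sum((1.0 - wi) * yi for wi, yi in zip(w, y)) / (n - sw)
        var = sum(wi * (yi - mu1) ** 2 + (1.0 - wi) * (yi - mu0) ** 2
                  for wi, yi in zip(w, y)) / n
        b0, b1, sd = mu0, mu1 - mu0, math.sqrt(var)
    # Likelihood ratio test of b1 = 0 against a chi-square with 1 degree of freedom.
    stat = max(0.0, 2.0 * (loglik(y, p, b0, b1, sd) - ll_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return b0, b1, sd, stat, p_value


# Simulated data (purely illustrative): calibrated call probabilities and a true effect of 2.
random.seed(1)
y, p = [], []
for _ in range(400):
    pi = random.choice([0.02, 0.5, 0.98])
    s = 1 if random.random() < pi else 0
    y.append(1.0 + 2.0 * s + random.gauss(0.0, 1.0))
    p.append(pi)

b0_hat, b1_hat, sd_hat, lrt_stat, p_value = fit_em(y, p)
print(f"b0={b0_hat:.2f} b1={b1_hat:.2f} sd={sd_hat:.2f} LRT={lrt_stat:.1f} p={p_value:.2e}")
```

A naive analysis would threshold `p` into a hard 0/1 call and run ordinary regression; the sketch instead keeps ambiguous samples (here those with `p = 0.5`) in the likelihood with their uncertainty intact, which is the kind of information a measurement-error-aware test is meant to exploit.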

