Integrating mutation and gene expression cross-sectional data to infer cancer progression.

Author information

Fleck Julia L, Pavel Ana B, Cassandras Christos G

Affiliations

Division of Systems Engineering, Boston University, 15 Saint Mary's Street, Brookline, MA 02446, USA.

Graduate Program in Bioinformatics, Boston University, 24 Cummington Mall, Boston, MA 02215, USA.

Publication information

BMC Syst Biol. 2016 Jan 25;10:12. doi: 10.1186/s12918-016-0255-6.

Abstract

BACKGROUND

A major problem in identifying the best therapeutic targets for cancer is the molecular heterogeneity of the disease. Cancer is often caused by an accumulation of mutations that produce irreversible damage to the cell's control mechanisms of survival and proliferation. Different mutations may affect these cellular mechanisms through a combination of molecular interactions, which may change dynamically during cancer progression. It has previously been shown that cancer accumulates mutations over time. In this paper we address the problem of cancer heterogeneity by modeling cancer progression using somatic mutation and gene expression cross-sectional data.

RESULTS

We propose a novel formulation for integrating somatic mutation and gene expression data to infer the temporal sequence of events from cross-sectional data. Using a mixed integer linear program, we model the interaction between groups of mutated genes and the resulting modifications at the gene expression level. Our approach identifies a partition of mutation events that gradually produce gene expression changes in a partition of genes over time. The proposed formulation is tested using both simulated data and real breast cancer data with matched somatic mutations and gene expression measurements from The Cancer Genome Atlas. First, we classify the genes as oncogenes or tumor suppressors based on the frequency of driver mutations. As expected, the most frequently mutated genes in breast cancer are PIK3CA and TP53. Next, we select the genes with the most frequent driver mutations, together with a set of genes known to play roles in cancer development. We then apply the proposed mixed integer linear program to identify the temporal order in which genes mutate and, simultaneously, the changes they produce at the gene expression level during cancer progression. In addition, we are able to identify known causal relationships between mutations and gene expression changes in the PI3K/AKT and TP53 pathways.
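
The gene-selection step described above (ranking genes by how often they are mutated across samples) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy sample table, and the placeholder genes `GENE_X` and `GENE_Y` are assumptions made for illustration, and the numbers are invented rather than taken from The Cancer Genome Atlas. The abstract does not state the criterion used to separate oncogenes from tumor suppressors or the form of the mixed integer linear program, so neither is attempted here.

```python
from collections import Counter


def mutation_frequencies(samples):
    """Return, for each gene, the fraction of samples in which it is mutated.

    `samples` maps a sample ID to the list of genes mutated in that sample
    (a hypothetical input layout, not a format taken from the paper).
    """
    counts = Counter()
    for mutated_genes in samples.values():
        # Count each gene at most once per sample.
        counts.update(set(mutated_genes))
    n_samples = len(samples)
    return {gene: count / n_samples for gene, count in counts.items()}


def top_mutated(samples, k):
    """Return the k most frequently mutated genes (ties broken by name)."""
    freqs = mutation_frequencies(samples)
    return sorted(freqs, key=lambda gene: (-freqs[gene], gene))[:k]


# Invented toy data: four samples, two real gene symbols from the abstract
# and two placeholder genes. These are NOT real TCGA observations.
toy_samples = {
    "S1": ["PIK3CA", "TP53"],
    "S2": ["PIK3CA"],
    "S3": ["TP53", "GENE_X"],
    "S4": ["PIK3CA", "TP53", "GENE_Y"],
}

if __name__ == "__main__":
    print(mutation_frequencies(toy_samples))
    print(top_mutated(toy_samples, 2))
```

On real data, the selected genes would then serve as inputs to the temporal-ordering model; the sketch stops at the ranking step because that is the only part the abstract specifies.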

CONCLUSIONS

This paper proposes a new model to infer the temporal sequence in which mutations occur and lead to changes at the gene expression level during cancer progression. The approach is general and can be applied to any data sets with available somatic mutations and gene expression measurements.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c365/4727329/bf84e9c9aa62/12918_2016_255_Fig1_HTML.jpg
