Roy Sarkar Tapasree, Maity Arnab Kumar, Niu Yabo, Mallick Bani K
Department of Biology, Texas A&M University, College Station, TX, USA.
Department of Statistics, Texas A&M University, College Station, TX, USA.
Cancer Inform. 2019 Aug 24;18:1176935119871933. doi: 10.1177/1176935119871933. eCollection 2019.
Long non-coding RNAs (lncRNAs) are a large and diverse class of transcribed RNAs that have been shown to play a significant role in cancer development. In this study, we apply an integrative modeling framework that combines DNA copy number variation (CNV), lncRNA expression, and downstream target protein expression to predict patient survival in breast cancer. We develop a 3-stage model combining a mechanical model (lncRNA regressed on CNV, and target proteins regressed on lncRNA) and a clinical model (survival regressed on the estimated effects from the mechanical models). Using lncRNAs (such as and ) along with their CNV, target protein expression, and survival outcomes from The Cancer Genome Atlas (TCGA) database, we show that the predicted mean squared error and the integrated Brier score (IBS) are both lower for the proposed 3-step integrated model than for the 2-step model. The integrative model therefore has better predictive ability than the 2-step model, which does not consider target protein information.
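The staged structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single lncRNA, simulated data in place of TCGA, ordinary least squares at every stage, and an uncensored log survival time as the outcome, so only the prediction mean squared error is compared and no Brier score is computed. All function names (`ols_fit`, `ols_predict`, `simulate`, `run_sketch`) and all effect sizes are hypothetical.

```python
import numpy as np


def ols_fit(x, y):
    """Return least-squares coefficients for y ~ intercept + x."""
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def ols_predict(coef, x):
    """Predict from coefficients produced by ols_fit."""
    design = np.column_stack([np.ones(np.shape(x)[0]), x])
    return design @ coef


def simulate(n, rng):
    """Simulated stand-in for CNV, lncRNA, protein, and log survival time.

    The effect sizes are arbitrary illustrative assumptions.
    """
    cnv = rng.normal(size=n)
    lnc = 0.8 * cnv + rng.normal(scale=0.5, size=n)
    prot = 0.7 * lnc + rng.normal(scale=0.5, size=n)
    log_t = 1.0 + 0.5 * lnc - 0.6 * prot + rng.normal(scale=0.3, size=n)
    return cnv, lnc, prot, log_t


def run_sketch(seed=0, n_train=300, n_test=200):
    rng = np.random.default_rng(seed)
    cnv, lnc, prot, log_t = simulate(n_train + n_test, rng)
    tr, te = slice(0, n_train), slice(n_train, None)

    # Stage 1 (mechanical): lncRNA regressed on CNV, fitted on training data.
    b1 = ols_fit(cnv[tr], lnc[tr])
    lnc_cnv = ols_predict(b1, cnv)   # part of lncRNA explained by CNV
    lnc_other = lnc - lnc_cnv        # remaining lncRNA variation

    # Stage 2 (mechanical): target protein regressed on lncRNA.
    b2 = ols_fit(lnc[tr], prot[tr])
    # With one lncRNA, the lncRNA-explained part of the protein is collinear
    # with the two lncRNA terms, so only the remainder is carried forward.
    prot_other = prot - ols_predict(b2, lnc)

    # Stage 3 (clinical): log survival time regressed on the estimated effects.
    feats2 = np.column_stack([lnc_cnv, lnc_other])              # 2-step model
    feats3 = np.column_stack([lnc_cnv, lnc_other, prot_other])  # 3-step model

    mse = {}
    for name, feats in (("2-step", feats2), ("3-step", feats3)):
        coef = ols_fit(feats[tr], log_t[tr])
        pred = ols_predict(coef, feats[te])
        mse[name] = float(np.mean((log_t[te] - pred) ** 2))
    return mse["2-step"], mse["3-step"]


if __name__ == "__main__":
    mse2, mse3 = run_sketch()
    print(f"held-out MSE, 2-step: {mse2:.3f}")
    print(f"held-out MSE, 3-step: {mse3:.3f}")
```

Because the simulated protein has its own effect on survival, the 3-step fit is expected to give the lower held-out error here by construction; the sketch only shows how the stages feed one another and says nothing about the TCGA results reported above.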