Meng Jin, Li Ping, Zhang Qing, Yang Zhangru, Fu Shen
Department of Radiation Oncology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yi Shan Rd, Shanghai, 200233, China.
Radiation Oncology Center, Fudan University Shanghai Cancer Center (FUSCC), 399 LingLing Rd, Xu Hui District, Shanghai, 200032, China.
J Exp Clin Cancer Res. 2014 Oct 6;33(1):84. doi: 10.1186/s13046-014-0084-7.
Many long non-coding RNAs (lncRNAs) have been found to be good markers for several tumors. Using an lncRNA-mining approach, we aimed to identify an lncRNA expression signature that can predict breast cancer patient survival.
We performed lncRNA expression profiling in 887 breast cancer patients from Gene Expression Omnibus (GEO) datasets. The association between the lncRNA signature and clinical survival was analyzed using the training set (n = 327, from GSE20685). The association was then validated in three other independent testing sets (252 from GSE21653, 204 from GSE12276, and 104 from GSE42568).
A set of four lncRNA genes (U79277, AK024118, BC040204, AK000974) was identified by the random survival forest algorithm. Using a risk score based on the expression signature of these lncRNAs, we separated the patients into low-risk and high-risk groups with significantly different survival times in the training set. This signature was validated in the other three cohorts. Further study revealed that the four-lncRNA expression signature was independent of age and subtype. Gene Set Enrichment Analysis (GSEA) suggested that gene sets were involved in several cancer metastasis-related pathways.
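The risk-score stratification described above can be illustrated with a minimal sketch. The abstract does not give the score formula or the cutoff, so this assumes a common convention: a weighted linear combination of the four lncRNA expression values, with patients split at the cohort median. The weights, patient identifiers, and expression values below are hypothetical placeholders, not results from the study.

```python
from statistics import median

# The four lncRNAs named in the abstract.
LNCRNAS = ("U79277", "AK024118", "BC040204", "AK000974")

# Hypothetical weights for illustration only; the abstract does not report
# the coefficients used in the study.
WEIGHTS = {"U79277": 0.5, "AK024118": -0.3, "BC040204": 0.2, "AK000974": -0.4}


def risk_score(expression, weights=WEIGHTS):
    """Weighted sum of the four lncRNA expression values (assumed form)."""
    return sum(weights[gene] * expression[gene] for gene in LNCRNAS)


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for four hypothetical patients.
patients = {
    "p1": {"U79277": 1.0, "AK024118": 2.0, "BC040204": 0.5, "AK000974": 1.0},
    "p2": {"U79277": 3.0, "AK024118": 1.0, "BC040204": 2.0, "AK000974": 0.5},
    "p3": {"U79277": 2.0, "AK024118": 2.0, "BC040204": 1.0, "AK000974": 1.0},
    "p4": {"U79277": 0.5, "AK024118": 3.0, "BC040204": 0.5, "AK000974": 2.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

In an analysis of this kind, the cutoff would typically be fixed on the training set and then applied unchanged to each testing set, after which survival in the two groups could be compared, for example with a log-rank test.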
These findings indicate that lncRNAs may be implicated in breast cancer pathogenesis. The four-lncRNA signature may have clinical implications in the selection of high-risk patients for adjuvant therapy.