Xu Chaohan, Ping Yanyan, Zhao Hongying, Ning Shangwei, Xia Peng, Wang Weida, Wan Linyun, Li Jie, Zhang Li, Yu Lei, Xiao Yun
College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China.
Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, Harbin, China.
Oncotarget. 2017 Dec 8;8(70):114603-114612. doi: 10.18632/oncotarget.23059. eCollection 2017 Dec 29.
Knowledge of long non-coding RNAs (lncRNAs) remains very limited, and discovering novel disease-related lncRNAs has been a major challenge in cancer research. In this work, we developed an LncRNA Network-based Prioritization approach, named "LncNetP", based on the competing endogenous RNA (ceRNA) and disease phenotype association assumptions. Applied to 11 cancer types with 3089 common lncRNA and miRNA samples from The Cancer Genome Atlas (TCGA), our approach yielded an average area under the ROC curve (AUC) of 83.87% under leave-one-out cross-validation, with the highest AUC (95.22%) for renal cell carcinoma. Moreover, we demonstrated the strong performance of our approach by evaluating influencing factors, including disease phenotype associations, known disease lncRNAs, and the number of cancer types. Comparisons with previous methods further supported the value of our integrative approach. Taking hepatocellular carcinoma (LIHC) as a case study, we predicted four candidate lncRNA genes, RHPN1-AS1, AC007389.1, LINC01116, and BMS1P20, that may serve as novel risk factors for disease diagnosis and prognosis. In summary, our lncRNA prioritization strategy can efficiently identify disease-related lncRNAs and help researchers better understand the important roles of lncRNAs in human cancers.