A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

Author information

Seok Junhee, Davis Ronald W, Xiao Wenzhong

Affiliations

School of Electrical Engineering, Korea University, Seoul 136-713, Republic of Korea.

Stanford Genome Technology Center, Palo Alto, California, United States of America.

Publication information

PLoS One. 2015 May 1;10(5):e0122103. doi: 10.1371/journal.pone.0122103. eCollection 2015.

Abstract

Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analysis of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets has not been well studied. In this work, we propose a hybrid method that uses single-gene and gene-set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes supplement the information missing from gene sets because of our imperfect biomedical knowledge. In tests on multiple cancer and trauma injury data sets, the proposed method showed robust and improved performance compared with conventional approaches that use only single genes or only gene sets. Additionally, we examined the prediction results for the trauma injury data and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable biologically. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.
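To make the idea of combining gene-set and single-gene information concrete, here is a minimal Python sketch. It is not the authors' method, and every choice in it is an assumption made for illustration: a gene set is summarized as the mean z-score of its member genes, the hybrid feature matrix is simply the set scores placed next to the single-gene z-scores, and risk predictions for censored survival times are evaluated with Harrell's concordance index. The function names `hybrid_features` and `concordance_index` are hypothetical.

```python
import numpy as np

def zscore(expr):
    # Standardize each gene (column) across samples.
    # Assumes no gene has zero variance.
    return (expr - expr.mean(axis=0)) / expr.std(axis=0)

def hybrid_features(expr, gene_sets):
    """Build [gene-set scores | single-gene z-scores].

    expr: array of shape (samples, genes).
    gene_sets: dict mapping a set name to a list of gene column indices.
    The mean z-score summary is an illustrative assumption.
    """
    z = zscore(expr)
    set_scores = np.column_stack(
        [z[:, idx].mean(axis=1) for idx in gene_sets.values()]
    )
    return np.hstack([set_scores, z])

def concordance_index(time, event, risk):
    """Harrell's C for right-censored data (ties in time are ignored).

    A pair (i, j) is comparable when sample i had an observed event
    and time[i] < time[j]; it is concordant when risk[i] > risk[j].
    """
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored sample cannot anchor a comparable pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable
```

Any survival model (for example, a penalized Cox regression) could then be fit on the output of `hybrid_features`, and its predicted risks scored with `concordance_index` on held-out samples.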

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0886/4416884/37966b33da6d/pone.0122103.g001.jpg
