Multi-Run Concrete Autoencoder to Identify Prognostic lncRNAs for 12 Cancers.

Affiliations

Knight Foundation School of Computing and Information Sciences, Florida International University, Miami, FL 33199, USA.

Department of Human and Molecular Genetics, Herbert Wertheim College of Medicine, Florida International University, Miami, FL 33199, USA.

Publication information

Int J Mol Sci. 2021 Nov 3;22(21):11919. doi: 10.3390/ijms222111919.

Abstract

BACKGROUND

Long non-coding RNAs (lncRNAs) play a vital role in altering the expression profiles of various target genes, and these changes can lead to cancer development. Identifying prognostic lncRNAs related to different cancers might therefore help in developing cancer therapies.

METHOD

To discover the critical lncRNAs that can identify the origin of different cancers, we propose using the concrete autoencoder (CAE), a state-of-the-art deep learning algorithm, in an unsupervised setting to efficiently identify a subset of the most informative features. However, because CAE is stochastic, it does not select reproducible features across runs. To address this issue, we propose a multi-run CAE (mrCAE) that identifies a stable set of features. The assumption is that a feature appearing in multiple runs carries more meaningful information about the data under consideration. To discover the key lncRNAs, we analyzed the genome-wide lncRNA expression profiles of 12 different types of cancer, comprising 4768 samples available in The Cancer Genome Atlas (TCGA). The lncRNAs identified in multiple runs of CAE were added to a final list of key lncRNAs capable of identifying the 12 cancers.
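The multi-run idea described above can be illustrated with a minimal sketch. It assumes each CAE run has already produced a list of selected feature names; the function name `stable_features`, the threshold parameter `min_runs`, and the lncRNA labels are hypothetical and not taken from the paper, which does not state its exact aggregation rule in this abstract.

```python
from collections import Counter


def stable_features(run_selections, min_runs):
    """Keep features selected in at least `min_runs` independent runs.

    `run_selections` is a list with one entry per run; each entry is the
    list of feature names chosen by that run. Returns (feature, count)
    pairs, most frequently selected first (ties broken alphabetically).
    The threshold rule is an assumption made for illustration.
    """
    counts = Counter()
    for selected in run_selections:
        # Count a feature at most once per run, even if a run lists it twice.
        counts.update(set(selected))
    kept = [(name, n) for name, n in counts.items() if n >= min_runs]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


# Hypothetical output of three stochastic selector runs (placeholder names).
runs = [
    ["lnc_A", "lnc_B", "lnc_C"],
    ["lnc_B", "lnc_C", "lnc_D"],
    ["lnc_C", "lnc_E", "lnc_B"],
]

for name, n in stable_features(runs, min_runs=2):
    print(f"{name}: selected in {n} of {len(runs)} runs")
```

Features that a single run picks by chance (here `lnc_A`, `lnc_D`, `lnc_E`) drop out, while those that recur across runs survive, which is the stability argument the method relies on.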

RESULTS

Our results showed that mrCAE performs better in feature selection than single-run CAE, the standard autoencoder (AE), and other state-of-the-art feature selection techniques. This study revealed a set of 128 top-ranking lncRNAs that could identify the origin of 12 different cancers with an accuracy of 95%. Survival analysis showed that 76 of the 128 lncRNAs have the prognostic capability to differentiate high- and low-risk groups of patients with different cancers.
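One common way to test whether a single lncRNA separates high- and low-risk patients is to split patients at the median expression and compare the two groups with a log-rank test. The sketch below shows that approach on toy data; the abstract does not specify which survival method the authors used, so the median split, the function names, and all data values are assumptions for illustration only.

```python
import math
from statistics import median


def split_by_median(expression):
    """Label each patient True (high expression) or False (low expression)."""
    cut = median(expression)
    return [x > cut for x in expression]


def logrank_test(times, events, in_high):
    """Two-group log-rank test.

    times   : follow-up time per patient
    events  : 1 if death was observed, 0 if censored
    in_high : True if the patient is in the high-expression group
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        n = sum(1 for s in times if s >= t)                      # at risk, both groups
        n1 = sum(1 for s, g in zip(times, in_high) if s >= t and g)  # at risk, high group
        d = sum(1 for s, e in zip(times, events) if e and s == t)    # events at time t
        d1 = sum(1 for s, e, g in zip(times, events, in_high)
                 if e and s == t and g)                           # events in high group
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2))


# Toy cohort (fabricated for illustration, not data from the study).
expression = [1, 2, 3, 4, 5, 6, 7, 8]
months = [30, 28, 25, 24, 6, 5, 4, 3]
died = [1, 1, 1, 1, 1, 1, 1, 1]

high = split_by_median(expression)
stat, p_value = logrank_test(months, died, high)
print(f"log-rank chi-square = {stat:.2f}, p = {p_value:.4f}")
```

Under this assumed procedure, a lncRNA would be counted as prognostic when the p-value falls below a chosen significance level, repeated per lncRNA and per cancer type.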

CONCLUSION

The proposed mrCAE, which selects actual features, outperformed the AE, even though the AE selects latent (pseudo-) features. By selecting actual features instead of pseudo-features, mrCAE can be valuable for precision medicine. The identified prognostic lncRNAs can be further studied to develop therapies for different cancers.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcf6/8584911/8e721143e7a8/ijms-22-11919-g001.jpg
