Cancer mutational signatures representation by large-scale context embedding.

Affiliations

Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

Department of Computer Science and Technology, Shanghai Jiao Tong University, Shanghai 200240, China.

Publication information

Bioinformatics. 2020 Jul 1;36(Suppl_1):i309-i316. doi: 10.1093/bioinformatics/btaa433.

Abstract

MOTIVATION

The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the computational challenges in analysing the complex mutational patterns.

RESULTS

Here, we develop a new algorithm, called MutSpace, to effectively extract patient-specific mutational features using an embedding framework for larger sequence context. Our method is motivated by the observation that the mutation rate at megabase scale and the local mutational patterns jointly contribute to distinguishing cancer subtypes, both of which can be simultaneously captured by MutSpace. Simulation evaluations show that MutSpace can effectively characterize mutational features from known patient subgroups and achieve superior performance compared with previous methods. As a proof-of-principle, we apply MutSpace to 560 breast cancer patient samples and demonstrate that our method achieves high accuracy in subtype identification. In addition, the learned embeddings from MutSpace reflect intrinsic patterns of breast cancer subtypes and other features of genome structure and function. MutSpace is a promising new framework to better understand cancer heterogeneity based on somatic mutations.
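As a rough illustration of the two signals named above (megabase-scale position and local sequence context), the sketch below turns each substitution into a coarse genomic-bin token and a flanking-context token, then counts them per patient. This is not the MutSpace implementation: the function names, the 1 Mb default bin size, the one-base flank, and the token format are all assumptions made for illustration, and the embedding step itself is not shown.

```python
from collections import Counter


def mutation_features(chrom, pos, ref, alt, genome, flank=1, bin_size=1_000_000):
    """Return (bin token, context token) for one substitution.

    Hypothetical helper: pos is 0-based and genome maps a chromosome
    name to its sequence string. bin_size and flank are illustrative
    defaults, not values taken from the paper.
    """
    seq = genome[chrom]
    if seq[pos] != ref:
        raise ValueError("reference allele does not match the genome")
    # Flanking bases on each side give the local mutational context.
    left = seq[max(0, pos - flank):pos]
    right = seq[pos + 1:pos + 1 + flank]
    # Integer division assigns the mutation to a coarse genomic bin.
    bin_token = f"{chrom}:{pos // bin_size}"
    context_token = f"{left}[{ref}>{alt}]{right}"
    return bin_token, context_token


def patient_profile(mutations, genome, flank=1, bin_size=1_000_000):
    """Count bin tokens and context tokens over one patient's mutations."""
    bins, contexts = Counter(), Counter()
    for chrom, pos, ref, alt in mutations:
        b, c = mutation_features(chrom, pos, ref, alt, genome, flank, bin_size)
        bins[b] += 1
        contexts[c] += 1
    return bins, contexts
```

An embedding model could take such per-patient token counts as input; how MutSpace actually encodes and combines these features is described in the paper and the linked source code.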

AVAILABILITY AND IMPLEMENTATION

Source code of MutSpace can be accessed at: https://github.com/ma-compbio/MutSpace.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5670/7355300/e90a313f28c2/btaa433f1.jpg