Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status.

Author information

Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.

The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.

Publication information

Nat Biomed Eng. 2024 Jan;8(1):57-67. doi: 10.1038/s41551-023-01120-3. Epub 2023 Nov 2.

Abstract

Large-scale genomic data are well suited to analysis by deep learning algorithms. However, for many genomic datasets, labels are available only at the level of the sample rather than for individual genomic measures. Machine learning models leveraging these datasets generate predictions by using statically encoded measures that are then aggregated at the sample level. Here we show that a single weakly supervised end-to-end multiple-instance-learning model with multi-headed attention can be trained to encode and aggregate the local sequence context or genomic position of somatic mutations. This allows the importance of individual measures for sample-level classification to be modelled, and thus provides enhanced explainability. The model solves synthetic tasks that conventional models fail at, and achieves best-in-class performance for the classification of tumour type and for predicting microsatellite status. By improving the performance of tasks that require aggregate information from genomic datasets, multiple-instance deep learning may generate biological insight.
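
As a rough illustration of the aggregation step described above, the sketch below pools a bag of instance vectors (standing in for encoded mutations) into one sample-level vector using several attention heads. It is a minimal sketch under stated assumptions, not the authors' architecture: the names `attention_pool` and `head_weights`, the linear scoring function, and the toy values are all hypothetical, and in a trained model both the instance encoder and the attention weights would be learned end to end from sample-level labels.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(instances, head_weights):
    """Pool a bag of instance vectors into one sample vector.

    instances:    list of feature vectors, one per instance (e.g. per mutation).
    head_weights: one scoring vector per attention head (hypothetical,
                  fixed here; learned in a real model).
    Returns the concatenated per-head pooled vectors and the per-head
    attention weights over instances.
    """
    dim = len(instances[0])
    sample_vector = []
    attention = []
    for w in head_weights:
        # Score each instance with a simple dot product (an assumption).
        scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in instances]
        # Normalise scores across the bag so the weights sum to 1.
        alpha = softmax(scores)
        attention.append(alpha)
        # Attention-weighted sum of the instance vectors for this head.
        pooled = [sum(a * x[d] for a, x in zip(alpha, instances))
                  for d in range(dim)]
        sample_vector.extend(pooled)
    return sample_vector, attention


# Toy bag of three 2-dimensional instances and two attention heads.
instances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
head_weights = [[2.0, 0.0], [0.0, 2.0]]
sample_vector, attention = attention_pool(instances, head_weights)
```

The per-head attention weights are what would support explainability in this kind of model: each weight indicates how much one instance contributed to the sample-level representation for that head.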
