Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status.

Affiliations

Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.

The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.

Publication information

Nat Biomed Eng. 2024 Jan;8(1):57-67. doi: 10.1038/s41551-023-01120-3. Epub 2023 Nov 2.

DOI:10.1038/s41551-023-01120-3
PMID:37919367
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10805698/

Abstract

Large-scale genomic data are well suited to analysis by deep learning algorithms. However, for many genomic datasets, labels are at the level of the sample rather than for individual genomic measures. Machine learning models leveraging these datasets generate predictions by using statically encoded measures that are then aggregated at the sample level. Here we show that a single weakly supervised end-to-end multiple-instance-learning model with multi-headed attention can be trained to encode and aggregate the local sequence context or genomic position of somatic mutations, hence allowing for the modelling of the importance of individual measures for sample-level classification and thus providing enhanced explainability. The model solves synthetic tasks that conventional models fail at, and achieves best-in-class performance for the classification of tumour type and for predicting microsatellite status. By improving the performance of tasks that require aggregate information from genomic datasets, multiple-instance deep learning may generate biological insight.
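
The aggregation idea in the abstract (encode each mutation as an instance, weight instances with several attention heads, and pool them into one sample-level prediction) can be illustrated with a minimal NumPy sketch. This is not the authors' model: the function name `attention_mil_forward`, the layer sizes, the `tanh` encoder and the random, untrained weights are all assumptions chosen only to show the data flow. The returned attention matrix is what would be inspected to judge the importance of individual instances.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_mil_forward(instances, W_enc, W_att, W_out):
    """Forward pass for one sample (bag) of shape (n_instances, n_features).

    Hypothetical layer layout, for illustration only.
    """
    h = np.tanh(instances @ W_enc)       # (n, d): per-instance encodings
    scores = h @ W_att                   # (n, heads): one score per head
    attn = softmax(scores, axis=0)       # normalise over instances, per head
    pooled = attn.T @ h                  # (heads, d): attention-weighted sums
    logits = pooled.reshape(-1) @ W_out  # (n_classes,): sample-level logits
    return softmax(logits, axis=0), attn

rng = np.random.default_rng(0)
n_features, d, heads, n_classes = 8, 16, 4, 3   # assumed sizes
W_enc = rng.normal(size=(n_features, d))
W_att = rng.normal(size=(d, heads))
W_out = rng.normal(size=(heads * d, n_classes))

# One bag of 50 instances standing in for the mutations of one sample.
bag = rng.normal(size=(50, n_features))
probs, attn = attention_mil_forward(bag, W_enc, W_att, W_out)
print(probs.shape, attn.shape)
```

Because the pooling is a weighted sum over instances, the prediction does not depend on the order of the instances, and only a sample-level label would be needed to train the weights.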

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/6047f4424c9f/41551_2023_1120_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/c2a4a6b12fc9/41551_2023_1120_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/f4f0d827175b/41551_2023_1120_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/0cdf11bfe39a/41551_2023_1120_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/5a115ff7f0d5/41551_2023_1120_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/c26f795cf0fe/41551_2023_1120_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/57e3e31473a7/41551_2023_1120_Fig7_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1094/10805698/2d9b66def223/41551_2023_1120_Fig8_ESM.jpg

Similar articles

1
Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status.
Nat Biomed Eng. 2024 Jan;8(1):57-67. doi: 10.1038/s41551-023-01120-3. Epub 2023 Nov 2.
2
Positional encoding-guided transformer-based multiple instance learning for histopathology whole slide images classification.
Comput Methods Programs Biomed. 2025 Jan;258:108491. doi: 10.1016/j.cmpb.2024.108491. Epub 2024 Nov 9.
3
SeLa-MIL: Developing an instance-level classifier via weakly-supervised self-training for whole slide image classification.
Comput Methods Programs Biomed. 2025 Apr;261:108614. doi: 10.1016/j.cmpb.2025.108614. Epub 2025 Jan 27.
4
Deep semi-supervised multiple instance learning with self-correction for DME classification from OCT images.
Med Image Anal. 2023 Jan;83:102673. doi: 10.1016/j.media.2022.102673. Epub 2022 Oct 26.
5
Mutation-Attention (MuAt): deep representation learning of somatic mutations for tumour typing and subtyping.
Genome Med. 2023 Jul 7;15(1):47. doi: 10.1186/s13073-023-01204-4.
6
Cyclic Learning: Bridging Image-Level Labels and Nuclei Instance Segmentation.
IEEE Trans Med Imaging. 2023 Oct;42(10):3104-3116. doi: 10.1109/TMI.2023.3275609. Epub 2023 Oct 2.
7
Weakly Semi-supervised phenotyping using Electronic Health records.
J Biomed Inform. 2022 Oct;134:104175. doi: 10.1016/j.jbi.2022.104175. Epub 2022 Sep 5.
8
Development and validation of a weakly supervised deep learning framework to predict the status of molecular pathways and key mutations in colorectal cancer from routine histology images: a retrospective study.
Lancet Digit Health. 2021 Dec;3(12):e763-e772. doi: 10.1016/S2589-7500(21)00180-1. Epub 2021 Oct 19.
9
A clinical text classification paradigm using weak supervision and deep representation.
BMC Med Inform Decis Mak. 2019 Jan 7;19(1):1. doi: 10.1186/s12911-018-0723-6.
10
When multiple instance learning meets foundation models: Advancing histological whole slide image analysis.
Med Image Anal. 2025 Apr;101:103456. doi: 10.1016/j.media.2025.103456. Epub 2025 Jan 14.

Cited by

1
Integration of Gene Expression and Digital Histology to Predict Treatment-Specific Responses in Breast Cancer.
medRxiv. 2025 Aug 27:2025.08.25.25334393. doi: 10.1101/2025.08.25.25334393.
2
Deep Learning for Biomarker Discovery in Cancer Genomes.
bioRxiv. 2025 Jan 8:2025.01.06.631471. doi: 10.1101/2025.01.06.631471.
3
Spatial oncology: Translating contextual biology to the clinic.

References

1
Multiple instance learning for digital pathology: A review of the state-of-the-art, limitations & future potential.
Comput Med Imaging Graph. 2024 Mar;112:102337. doi: 10.1016/j.compmedimag.2024.102337. Epub 2024 Jan 13.
2
Mutation-Attention (MuAt): deep representation learning of somatic mutations for tumour typing and subtyping.
Genome Med. 2023 Jul 7;15(1):47. doi: 10.1186/s13073-023-01204-4.
3
Genomics enters the deep learning era.
Cancer Cell. 2024 Oct 14;42(10):1653-1675. doi: 10.1016/j.ccell.2024.09.001. Epub 2024 Oct 3.
4
A Self-Supervised Equivariant Refinement Classification Network for Diabetic Retinopathy Classification.
J Imaging Inform Med. 2025 Jun;38(3):1796-1811. doi: 10.1007/s10278-024-01270-z. Epub 2024 Sep 19.
5
Lynch Syndrome and Somatic Mismatch Repair Variants in Pancreas Cancer.
JAMA Oncol. 2024 Nov 1;10(11):1511-1518. doi: 10.1001/jamaoncol.2024.3651.
6
A guide to artificial intelligence for cancer researchers.
Nat Rev Cancer. 2024 Jun;24(6):427-441. doi: 10.1038/s41568-024-00694-7. Epub 2024 May 16.
Genomics enters the deep learning era.
PeerJ. 2022 Jun 24;10:e13613. doi: 10.7717/peerj.13613. eCollection 2022.
4
Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration.
Life (Basel). 2021 Dec 21;12(1):1. doi: 10.3390/life12010001.
5
Biologically informed deep neural network for prostate cancer discovery.
Nature. 2021 Oct;598(7880):348-352. doi: 10.1038/s41586-021-03922-4. Epub 2021 Sep 22.
6
AI-based pathology predicts origins for cancers of unknown primary.
Nature. 2021 Jun;594(7861):106-110. doi: 10.1038/s41586-021-03512-4. Epub 2021 May 5.
7
Data-efficient and weakly supervised computational pathology on whole-slide images.
Nat Biomed Eng. 2021 Jun;5(6):555-570. doi: 10.1038/s41551-020-00682-w. Epub 2021 Mar 1.
8
A multi-resolution model for histopathology image classification and localization with multiple instance learning.
Comput Biol Med. 2021 Apr;131:104253. doi: 10.1016/j.compbiomed.2021.104253. Epub 2021 Feb 10.
9
Deep neural network classification based on somatic mutations potentially predicts clinical benefit of immune checkpoint blockade in lung adenocarcinoma.
Oncoimmunology. 2020 Feb 29;9(1):1734156. doi: 10.1080/2162402X.2020.1734156. eCollection 2020.
10
The repertoire of mutational signatures in human cancer.
Nature. 2020 Feb;578(7793):94-101. doi: 10.1038/s41586-020-1943-3. Epub 2020 Feb 5.