A cell-level discriminative neural network model for diagnosis of blood cancers.

Affiliations

Department of Computer Science, University of California, Irvine, CA 92697, United States.

Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States.

Publication information

Bioinformatics. 2023 Oct 3;39(10). doi: 10.1093/bioinformatics/btad585.

DOI:10.1093/bioinformatics/btad585

PMID:37756695

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10563151/

Abstract

MOTIVATION

Precise identification of cancer cells in patient samples is essential for accurate diagnosis and clinical monitoring but has been a significant challenge in machine learning approaches for cancer precision medicine. In most scenarios, training data are only available with disease annotation at the subject or sample level. Traditional approaches separate the classification process into multiple steps that are optimized independently. Recent methods either focus on predicting sample-level diagnosis without identifying individual pathologic cells or are less effective for identifying heterogeneous cancer cell phenotypes.

RESULTS

We developed a generalized end-to-end differentiable model, the Cell Scoring Neural Network (CSNN), which takes sample-level training data and predicts the diagnosis of the testing samples and the identity of the diagnostic cells in the sample, simultaneously. The cell-level density differences between samples are linked to the sample diagnosis, which allows the probabilities of individual cells being diagnostic to be calculated using backpropagation. We applied CSNN to two independent clinical flow cytometry datasets for leukemia diagnosis. In both qualitative and quantitative assessments, CSNN outperformed preexisting neural network modeling approaches for both cancer diagnosis and cell-level classification. Post hoc decision trees and 2D dot plots were generated for interpretation of the identified cancer cells, showing that the identified cell phenotypes match the cancer endotypes observed clinically in patient cohorts. Independent data clustering analysis confirmed the identified cancer cell populations.
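The idea of learning per-cell scores from sample-level labels alone can be illustrated with a toy multiple-instance sketch. This is not the CSNN architecture: the linear cell scorer, the mean-pooling of cell scores into a sample-level fraction, the synthetic two-marker data, and every function and parameter name below are assumptions made for illustration only. Gradients are derived by hand to make the backpropagation path from the sample label to each cell explicit.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def make_samples(rng, n_samples=40, n_cells=200, frac_abnormal=0.3):
    """Synthetic 2-marker data (hypothetical): half the samples hold an abnormal cluster."""
    X = rng.normal(0.0, 1.0, size=(n_samples, n_cells, 2))
    y = np.zeros(n_samples)
    is_abnormal = np.zeros((n_samples, n_cells), dtype=bool)
    n_abn = int(frac_abnormal * n_cells)
    for s in range(n_samples // 2):
        X[s, :n_abn] += 4.0            # shift a subset of cells to a distinct phenotype
        is_abnormal[s, :n_abn] = True
        y[s] = 1.0                     # only this sample-level label is used in training
    return X, y, is_abnormal

def forward(params, X):
    w, b, a, c = params
    p_cell = sigmoid(X @ w + b)        # (samples, cells): per-cell score
    f = p_cell.mean(axis=1)            # assumed pooling: mean cell score per sample
    y_hat = sigmoid(a * f + c)         # sample-level probability
    return p_cell, f, y_hat

def train(X, y, steps=3000, lr=0.2):
    w, b, a, c = np.zeros(X.shape[2]), 0.0, 1.0, 0.0
    for _ in range(steps):
        p_cell, f, y_hat = forward((w, b, a, c), X)
        dz = (y_hat - y) / len(y)      # gradient of mean cross-entropy w.r.t. sample logit
        da, dc = np.sum(dz * f), np.sum(dz)
        # Backpropagate through the mean pooling and the per-cell sigmoid.
        du = (dz * a)[:, None] * p_cell * (1.0 - p_cell) / X.shape[1]
        dw = np.einsum("sc,scd->d", du, X)
        db = du.sum()
        w, b, a, c = w - lr * dw, b - lr * db, a - lr * da, c - lr * dc
    return w, b, a, c

rng = np.random.default_rng(0)
X, y, is_abnormal = make_samples(rng)
params = train(X, y)
p_cell, f, y_hat = forward(params, X)
print("sample-level accuracy:", np.mean((y_hat > 0.5) == (y == 1)))
print("mean score, abnormal cells:", p_cell[is_abnormal].mean())
print("mean score, other cells:", p_cell[~is_abnormal].mean())
```

The cell-level ground truth (`is_abnormal`) is never seen during training; it is used only afterwards to check whether the learned cell scores separate the shifted cluster from the background cells.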

AVAILABILITY AND IMPLEMENTATION

The source code of CSNN and datasets used in the experiments are publicly available on GitHub (http://github.com/erobl/csnn). Raw FCS files can be downloaded from FlowRepository (ID: FR-FCM-Z6YK).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08ea/10563151/5154cfaf27bd/btad585f1.jpg

Similar articles

A cell-level discriminative neural network model for diagnosis of blood cancers.

Bioinformatics. 2023 Oct 3;39(10). doi: 10.1093/bioinformatics/btad585.

A cell-level discriminative neural network model for diagnosis of blood cancers.

medRxiv. 2023 Feb 10:2023.02.07.23285606. doi: 10.1101/2023.02.07.23285606.

A neural network approach to breast cancer diagnosis as a constraint satisfaction problem.

Med Phys. 2001 May;28(5):804-11. doi: 10.1118/1.1367861.

COmic: convolutional kernel networks for interpretable end-to-end learning on (multi-)omics data.

Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i76-i85. doi: 10.1093/bioinformatics/btad204.

Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers.

Med Phys. 2018 Jul;45(7):3449-3459. doi: 10.1002/mp.12967. Epub 2018 Jun 13.

GateNet: A novel neural network architecture for automated flow cytometry gating.

Comput Biol Med. 2024 Sep;179:108820. doi: 10.1016/j.compbiomed.2024.108820. Epub 2024 Jul 12.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Mapping cell populations in flow cytometry data for cross-sample comparison using the Friedman-Rafsky test statistic as a distance measure.

Cytometry A. 2016 Jan;89(1):71-88. doi: 10.1002/cyto.a.22735. Epub 2015 Aug 14.

Convolutional neural networks for classification of Alzheimer's disease: Overview and reproducible evaluation.

Med Image Anal. 2020 Jul;63:101694. doi: 10.1016/j.media.2020.101694. Epub 2020 May 1.

Content-based image retrieval with a Convolutional Siamese Neural Network: Distinguishing lung cancer and tuberculosis in CT images.

Comput Biol Med. 2022 Jan;140:105096. doi: 10.1016/j.compbiomed.2021.105096. Epub 2021 Nov 30.

Cited by

Deep Learning in Hematology: From Molecules to Patients.

Clin Hematol Int. 2024 Oct 8;6(4):19-42. doi: 10.46989/001c.124131. eCollection 2024.
