Deep Learning-Based Identification of Tissue of Origin for Carcinomas of Unknown Primary Using MicroRNA Expression: Algorithm Development and Validation.

Authors

Raghu Ananya, Raghu Anisha, Wise Jillian F

Affiliations

Quarry Lane School, San Ramon, CA, United States.

Department of Biology and Biomedical Sciences, Salve Regina University, Newport, RI, United States.

Publication

JMIR Bioinform Biotechnol. 2024 Jul 24;5:e56538. doi: 10.2196/56538.

DOI: 10.2196/56538
PMID: 39046787
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11306940/
Abstract

BACKGROUND

Carcinoma of unknown primary (CUP) is a subset of metastatic cancers in which the primary tissue source of the cancer cells remains unidentified. CUP is the eighth most common malignancy worldwide, accounting for up to 5% of all malignancies. Representing an exceptionally aggressive metastatic cancer, the median survival is approximately 3 to 6 months. The tissue in which cancer arises plays a key role in our understanding of sensitivities to various forms of cell death. Thus, the lack of knowledge on the tissue of origin (TOO) makes it difficult to devise tailored and effective treatments for patients with CUP. Developing quick and clinically implementable methods to identify the TOO of the primary site is crucial in treating patients with CUP. Noncoding RNAs may hold potential for origin identification and provide a robust route to clinical implementation due to their resistance against chemical degradation.

OBJECTIVE

This study aims to investigate the potential of microRNAs, a subset of noncoding RNAs, as highly accurate biomarkers for detecting the TOO through data-driven, machine learning approaches for metastatic cancers.

METHODS

We used microRNA expression data from The Cancer Genome Atlas data set and assessed various machine learning approaches, from simple classifiers to deep learning approaches. As a test of our classifiers, we evaluated the accuracy on a separate set of 194 primary tumor samples from the Sequence Read Archive. We used permutation feature importance to determine the potential microRNA biomarkers and assessed them with principal component analysis and t-distributed stochastic neighbor embedding visualizations.
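Permutation feature importance, as named in the methods, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (three hypothetical tissues, four features, only the first two informative), and a simple nearest-centroid classifier stands in for the study's models, which the abstract does not specify in detail. The idea is to shuffle one feature at a time on held-out data and record how much the accuracy drops.

```python
import random

def make_data(n_per_class, seed):
    """Synthetic stand-in for expression data: 3 classes, 4 features.
    Only features 0 and 1 carry class signal; features 2 and 3 are noise."""
    rng = random.Random(seed)
    centers = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (0.0, 4.0)}
    X, y = [], []
    for label, (c0, c1) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(c0, 1.0), rng.gauss(c1, 1.0),
                      rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
            y.append(label)
    return X, y

def fit_centroids(X, y):
    """Nearest-centroid classifier: store the mean vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))

def accuracy(centroids, X, y):
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)

def permutation_importance(centroids, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

X_train, y_train = make_data(100, seed=1)
X_test, y_test = make_data(100, seed=2)   # held-out data for scoring
model = fit_centroids(X_train, y_train)
baseline, importances = permutation_importance(model, X_test, y_test)
print(f"baseline accuracy: {baseline:.3f}")
for j, imp in enumerate(importances):
    print(f"feature {j}: mean accuracy drop {imp:.3f}")
```

In this toy setup the two informative features should show a clear accuracy drop when shuffled, while the noise features should stay near zero. Because the procedure only needs a fitted model and a scoring function, the same loop applies to a decision tree, logistic regression, or a neural network.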

RESULTS

Our results show that it is possible to design robust classifiers to detect the TOO for metastatic samples on The Cancer Genome Atlas data set, with an accuracy of up to 97% (351/362), which may be used in situations of CUP. Our findings show that deep learning techniques enhance prediction accuracy. We progressed from an initial accuracy prediction of 62.5% (226/362) with decision trees to 93.2% (337/362) with logistic regression, finally achieving 97% (351/362) accuracy using deep learning on metastatic samples. On the Sequence Read Archive validation set, a lower accuracy of 41.2% (77/188) was achieved by the decision tree, while deep learning achieved a higher accuracy of 80.4% (151/188). Notably, our feature importance analysis showed the top 3 most important features for predicting TOO to be microRNA-10b, microRNA-205, and microRNA-196b, which aligns with previous work.
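As a worked check of the arithmetic, the short sketch below recomputes each accuracy from the correct/total counts printed in the results. The dictionary keys are labels chosen here for illustration; the counts and reported percentages are copied from the text. Several recomputed values come out a few tenths of a point below the printed ones, and the abstract does not say how the printed percentages were rounded or averaged.

```python
def accuracy_pct(correct, total):
    """Accuracy as a percentage of correctly classified samples."""
    return 100.0 * correct / total

# (correct, total, percentage as printed in the abstract)
results = {
    "TCGA decision tree": (226, 362, 62.5),
    "TCGA logistic regression": (337, 362, 93.2),
    "TCGA deep learning": (351, 362, 97.0),
    "SRA decision tree": (77, 188, 41.2),
    "SRA deep learning": (151, 188, 80.4),
}

for name, (correct, total, printed) in results.items():
    recomputed = accuracy_pct(correct, total)
    print(f"{name}: {correct}/{total} = {recomputed:.1f}% (printed: {printed}%)")
```

From the counts alone, the deep learning model's accuracy falls from about 97.0% on the TCGA metastatic samples to about 80.3% on the Sequence Read Archive set, a gap of roughly 17 percentage points.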

CONCLUSIONS

Our findings highlight the potential of using machine learning techniques to devise accurate tests for detecting TOO for CUP. Since microRNAs are carried throughout the body via extracellular vesicles secreted from cells, they may serve as key biomarkers for liquid biopsy due to their presence in blood plasma. Our work serves as a foundation toward developing blood-based cancer detection tests based on the presence of microRNA.

Figures 1-5:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b09b/11306940/6a861c82777d/bioinform_v5i1e56538_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b09b/11306940/07a354cf7a3b/bioinform_v5i1e56538_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b09b/11306940/955dff36cdfa/bioinform_v5i1e56538_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b09b/11306940/3e0e5325fa8c/bioinform_v5i1e56538_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b09b/11306940/b3657ee90eae/bioinform_v5i1e56538_fig5.jpg
Similar articles

1
Deep Learning-Based Identification of Tissue of Origin for Carcinomas of Unknown Primary Using MicroRNA Expression: Algorithm Development and Validation.
JMIR Bioinform Biotechnol. 2024 Jul 24;5:e56538. doi: 10.2196/56538.
2
CUP-AI-Dx: A tool for inferring cancer tissue of origin and molecular subtype using RNA gene-expression data and artificial intelligence.
EBioMedicine. 2020 Nov;61:103030. doi: 10.1016/j.ebiom.2020.103030. Epub 2020 Oct 9.
3
Evaluating DNA Methylation, Gene Expression, Somatic Mutation, and Their Combinations in Inferring Tumor Tissue-of-Origin.
Front Cell Dev Biol. 2021 May 3;9:619330. doi: 10.3389/fcell.2021.619330. eCollection 2021.
4
TOD-CUP: a gene expression rank-based majority vote algorithm for tissue origin diagnosis of cancers of unknown primary.
Brief Bioinform. 2021 Mar 22;22(2):2106-2118. doi: 10.1093/bib/bbaa031.
5
New techniques to identify the tissue of origin for cancer of unknown primary in the era of precision medicine: progress and challenges.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae028.
6
Application of Transcriptome-Based Gene Set Featurization for Machine Learning Model to Predict the Origin of Metastatic Cancer.
Curr Issues Mol Biol. 2024 Jul 9;46(7):7291-7302. doi: 10.3390/cimb46070432.
7
Image-based deep learning model using DNA methylation data predicts the origin of cancer of unknown primary.
Neoplasia. 2024 Sep;55:101021. doi: 10.1016/j.neo.2024.101021. Epub 2024 Jun 28.
8
A machine learning framework to trace tumor tissue-of-origin of 13 types of cancer based on DNA somatic mutation.
Biochim Biophys Acta Mol Basis Dis. 2020 Nov 1;1866(11):165916. doi: 10.1016/j.bbadis.2020.165916. Epub 2020 Aug 7.
9
Accurate classification of metastatic brain tumors using a novel microRNA-based test.
Oncologist. 2011;16(2):165-74. doi: 10.1634/theoncologist.2010-0305. Epub 2011 Jan 27.
10
A molecular approach integrating genomic and DNA methylation profiling for tissue of origin identification in lung-specific cancer of unknown primary.
J Transl Med. 2022 Apr 5;20(1):158. doi: 10.1186/s12967-022-03362-2.

Cited by

1
OncoTrace-TOO: Interpretable Machine Learning Framework for Cancer Tissue-of-Origin Identification Using Transcriptomic Signatures.
Cancer Rep (Hoboken). 2025 Aug;8(8):e70311. doi: 10.1002/cnr2.70311.
2
A Framework for Race-Specific Prostate Cancer Detection Using Machine Learning Through Gene Expression Data: Feature Selection Optimization Approach.
JMIR Bioinform Biotechnol. 2025 Jun 20;6. doi: 10.2196/72423.
3
Tumor tissue-of-origin classification using miRNA-mRNA-lncRNA interaction networks and machine learning methods.
Front Bioinform. 2025 May 6;5:1571476. doi: 10.3389/fbinf.2025.1571476. eCollection 2025.
