Prediction of lung cancer using gene expression and deep learning with KL divergence gene selection.

Affiliation

College of Public Health, Zhengzhou University, Zhengzhou, 450001, China.

Publication information

BMC Bioinformatics. 2022 May 12;23(1):175. doi: 10.1186/s12859-022-04689-9.

DOI: 10.1186/s12859-022-04689-9
PMID: 35549644
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9103042/
Abstract

BACKGROUND: Lung cancer is one of the cancers with the highest mortality rate in China. With the rapid development of high-throughput sequencing technology and of deep learning methods in recent years, deep neural networks based on gene expression have become an active research direction in lung cancer diagnosis and offer an effective route to early diagnosis. Building a deep neural network model is therefore of great significance for the early diagnosis of lung cancer. However, the main challenges in mining gene expression datasets are the curse of dimensionality and imbalanced data. The existing methods proposed by some researchers cannot address these problems, because the number of measured variables (genes) overwhelms the small number of samples, which results in poor performance in the early diagnosis of lung cancer.

METHOD: To address the small sample size, high dimensionality, and class imbalance of gene expression datasets, this paper proposes a gene selection method based on KL divergence, which selects the genes with higher KL divergence as model features. A deep neural network model is then built with Focal Loss as the loss function, and k-fold cross-validation with k = 5 is used to verify and select the best model.

RESULT: The proposed deep learning model based on KL divergence gene selection achieves an AUC of 0.99 on the validation set, and the model generalizes well.

CONCLUSION: The proposed deep neural network model based on KL divergence gene selection is shown to be an accurate and effective method for lung cancer prediction.
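The two ingredients named in the METHOD paragraph, KL-divergence gene ranking and Focal Loss, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the divergence is estimated, so the sketch uses a smoothed histogram estimate of KL(case || control) per gene, and the function names, the `bins` and `pseudocount` parameters, and the focal-loss defaults (`gamma=2.0`, `alpha=0.25`) are all hypothetical choices.

```python
import numpy as np


def kl_divergence_per_gene(X, y, bins=10, pseudocount=0.5):
    """Score each gene (column of X) by a histogram estimate of KL(case || control).

    Assumption: y holds binary labels (1 = case, 0 = control). The shared bin
    edges and the pseudocount smoothing are illustrative choices.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        col = X[:, j]
        # Use the same bin edges for both classes so the histograms are comparable.
        edges = np.histogram_bin_edges(col, bins=bins)
        p, _ = np.histogram(col[y == 1], bins=edges)
        q, _ = np.histogram(col[y == 0], bins=edges)
        # Smooth the counts so empty bins do not produce log(0) or division by zero.
        p = (p + pseudocount) / (p + pseudocount).sum()
        q = (q + pseudocount) / (q + pseudocount).sum()
        scores[j] = np.sum(p * np.log(p / q))
    return scores


def select_top_genes(X, y, n_genes):
    """Return the column indices of the n_genes highest-scoring genes."""
    scores = kl_divergence_per_gene(X, y)
    return np.argsort(scores)[::-1][:n_genes]


def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    # p_t is the predicted probability assigned to the true class.
    p_t = np.where(y_true == 1, p, 1.0 - p)
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


# Synthetic demonstration data (not from the paper): 60 samples, 20 genes,
# with gene 3 shifted upward in the case group.
rng = np.random.default_rng(0)
labels = np.array([0] * 30 + [1] * 30)
expression = rng.normal(size=(60, 20))
expression[labels == 1, 3] += 6.0

print("selected genes:", select_top_genes(expression, labels, n_genes=5))
print("focal loss:", focal_loss([1, 0], [0.9, 0.1]))
```

For the 5-fold validation step, scikit-learn's `StratifiedKFold(n_splits=5)` is one way to keep the class ratio similar across folds. In such a loop the gene ranking would normally be recomputed on each training fold only, so that the held-out fold does not influence which genes are selected.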

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6598/9103042/c11006f57c92/12859_2022_4689_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6598/9103042/0a0212e9af4f/12859_2022_4689_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6598/9103042/60d5456a0297/12859_2022_4689_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6598/9103042/d573c730c285/12859_2022_4689_Fig4_HTML.jpg
