TabDEG: Classifying differentially expressed genes from RNA-seq data based on feature extraction and deep learning framework.

Affiliation

School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou, Guangdong, China.

Publication information

PLoS One. 2024 Jul 22;19(7):e0305857. doi: 10.1371/journal.pone.0305857. eCollection 2024.

DOI:10.1371/journal.pone.0305857
PMID:39037985
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11262683/
Abstract

Traditional models for identifying differentially expressed genes (DEGs) have limitations on small-sample datasets: they require distributional assumptions to hold, and when those assumptions are violated, sample variation leads to high false positive and false negative rates. In contrast, tabular data models based on deep learning (DL) frameworks do not need to consider the type of data distribution or sample variation. However, applying DL to RNA-Seq data remains challenging because proper labels are lacking and the sample size is small relative to the number of genes. Data augmentation (DA) extracts data features using different methods and procedures, which can substantially increase the number of complementary pseudo-values derived from limited data without significant additional cost. Building on this, we combine DA with a DL framework-based tabular data model and propose TabDEG, a model that predicts DEGs and their direction of regulation (up or down) from gene expression data obtained from The Cancer Genome Atlas database. Compared with five counterpart methods, TabDEG has high sensitivity and low misclassification rates. Experiments show that TabDEG is robust and effective in enhancing data features to facilitate classification of high-dimensional, small-sample datasets, and validate that the DEGs predicted by TabDEG map to important Gene Ontology terms and pathways associated with cancer.
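The abstract does not spell out which features or augmentation procedures TabDEG uses, so the following is only a minimal sketch of the general idea of deriving extra pseudo-values per gene from a handful of samples. Everything in it is an assumption made for illustration: the use of log2 fold change as the feature, bootstrap resampling as the augmentation step, the `pseudocount` parameter, the function names, and the toy gene names and counts. It is not the authors' method.

```python
import math
import random
from statistics import mean


def log2_fold_change(tumor, normal, pseudocount=1.0):
    """log2 ratio of group means; the pseudocount avoids log(0)."""
    return math.log2((mean(tumor) + pseudocount) / (mean(normal) + pseudocount))


def augment_gene(tumor, normal, n_resamples=5, seed=0):
    """Build one tabular feature row for a single gene (hypothetical scheme).

    The first feature is the observed log2 fold change. The remaining
    features are log2 fold changes recomputed on bootstrap resamples
    (sampling with replacement) of each group.
    """
    rng = random.Random(seed)
    row = [log2_fold_change(tumor, normal)]
    for _ in range(n_resamples):
        t = rng.choices(tumor, k=len(tumor))
        n = rng.choices(normal, k=len(normal))
        row.append(log2_fold_change(t, n))
    return row


# Toy expression values (made up): 3 tumor and 3 normal samples per gene.
expression = {
    "GENE_UP": ([120.0, 150.0, 135.0], [30.0, 28.0, 35.0]),
    "GENE_DOWN": ([10.0, 12.0, 9.0], [80.0, 95.0, 88.0]),
    "GENE_FLAT": ([50.0, 52.0, 49.0], [51.0, 50.0, 48.0]),
}

# One row per gene: a small table that a tabular classifier could consume.
features = {gene: augment_gene(t, n) for gene, (t, n) in expression.items()}

for gene, row in features.items():
    print(gene, [round(x, 2) for x in row])
```

Each gene ends up with six values instead of one, which is the sense in which augmentation enlarges the feature table without new samples. In a full pipeline, rows like these would be passed to a three-class model (up-regulated, down-regulated, not differentially expressed); that classifier is omitted here.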

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/967129daf58d/pone.0305857.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/f8d46ae0a33d/pone.0305857.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/92d5ea270855/pone.0305857.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/af4195fd0ee0/pone.0305857.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/6f1841acc3e9/pone.0305857.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/45a018020094/pone.0305857.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6023/11262683/099bc37d7864/pone.0305857.g007.jpg

Similar articles

1. TabDEG: Classifying differentially expressed genes from RNA-seq data based on feature extraction and deep learning framework.
   PLoS One. 2024 Jul 22;19(7):e0305857. doi: 10.1371/journal.pone.0305857. eCollection 2024.
2. DEGnext: classification of differentially expressed genes from RNA-seq data using a convolutional neural network with transfer learning.
   BMC Bioinformatics. 2022 Jan 6;23(1):17. doi: 10.1186/s12859-021-04527-4.
3. A Linear Regression and Deep Learning Approach for Detecting Reliable Genetic Alterations in Cancer Using DNA Methylation and Gene Expression Data.
   Genes (Basel). 2020 Aug 12;11(8):931. doi: 10.3390/genes11080931.
4. A deep learning model to predict RNA-Seq expression of tumours from whole slide images.
   Nat Commun. 2020 Aug 3;11(1):3877. doi: 10.1038/s41467-020-17678-4.
5. Robust identification of differentially expressed genes from RNA-seq data.
   Genomics. 2020 Mar;112(2):2000-2010. doi: 10.1016/j.ygeno.2019.11.012. Epub 2019 Nov 20.
6. RNA-seq assistant: machine learning based methods to identify more transcriptional regulated genes.
   BMC Genomics. 2018 Jul 20;19(1):546. doi: 10.1186/s12864-018-4932-2.
7. Deep learning of gene relationships from single cell time-course expression data.
   Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab142.
8. Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations.
   BMC Med Genomics. 2020 Apr 3;13(Suppl 5):41. doi: 10.1186/s12920-020-0686-1.
9. Deep learning-based model for predicting progression in patients with head and neck squamous cell carcinoma.
   Cancer Biomark. 2020;27(1):19-28. doi: 10.3233/CBM-190380.
10. Integration of RNA-Seq data with heterogeneous microarray data for breast cancer profiling.
    BMC Bioinformatics. 2017 Nov 21;18(1):506. doi: 10.1186/s12859-017-1925-0.
