TNMplot.com: A Web Tool for the Comparison of Gene Expression in Normal, Tumor and Metastatic Tissues.

Affiliations

Department of Bioinformatics, Semmelweis University, 1094 Budapest, Hungary.

Momentum Cancer Biomarker Research Group, Research Centre for Natural Sciences, 1117 Budapest, Hungary.

Publication information

Int J Mol Sci. 2021 Mar 5;22(5):2622. doi: 10.3390/ijms22052622.

Abstract

Genes showing higher expression in either tumor or metastatic tissues can help in better understanding tumor formation and can serve as biomarkers of progression or as potential therapy targets. Our goal was to establish an integrated database using available transcriptome-level datasets and to create a web platform which enables the mining of this database by comparing normal, tumor and metastatic data across all genes in real time. We utilized data generated by either gene arrays from the Gene Expression Omnibus of the National Center for Biotechnology Information (NCBI-GEO) or RNA-seq from The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and The Genotype-Tissue Expression (GTEx) repositories. The altered expression within different platforms was analyzed separately. Statistical significance was computed using Mann-Whitney or Kruskal-Wallis tests. False Discovery Rate (FDR) was computed using the Benjamini-Hochberg method. The entire database contains 56,938 samples, including 33,520 samples from 3180 gene chip-based studies (453 metastatic, 29,376 tumorous and 3691 normal samples), 11,010 samples from TCGA (394 metastatic, 9886 tumorous and 730 normal), 1193 samples from TARGET (1 metastatic, 1180 tumorous and 12 normal) and 11,215 normal samples from GTEx. The most consistently upregulated genes across multiple tumor types were TOP2A (FC = 7.8), SPP1 (FC = 7.0) and CENPA (FC = 6.03), and the most consistently downregulated gene was ADH1B (FC = 0.15). Validation of differential expression using equally sized training and test sets confirmed the reliability of the database in breast, colon, and lung cancer at an FDR below 10%. The online analysis platform enables unrestricted mining of the database and is accessible at TNMplot.com.
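The statistical steps named in the abstract (a Mann-Whitney test for a two-group comparison, followed by Benjamini-Hochberg adjustment across genes) can be illustrated with a minimal sketch. This is not the TNMplot.com implementation: the function names, the normal approximation used for the p-value, the definition of fold change as a ratio of group medians, and the expression values are all assumptions made for illustration. The Kruskal-Wallis test used for three-group comparisons is not shown.

```python
from math import erfc, sqrt
from statistics import median


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # all observations identical
        return u_x, 1.0
    # Continuity-corrected z score; clamp at zero so p never exceeds 1.
    z = max((abs(u_x - mean_u) - 0.5) / sqrt(variance), 0.0)
    return u_x, erfc(z / sqrt(2))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def fold_change(tumor, normal):
    """Ratio of group medians (an assumed definition of FC)."""
    return median(tumor) / median(normal)


if __name__ == "__main__":
    # Hypothetical expression values, not taken from the database.
    genes = {
        "GENE_A": ([120, 135, 150, 160, 180], [20, 22, 25, 30, 31]),
        "GENE_B": ([10, 12, 11, 13, 12], [11, 12, 10, 13, 12]),
    }
    names = list(genes)
    pvals = [mann_whitney_u(*genes[g])[1] for g in names]
    for name, p, q in zip(names, pvals, benjamini_hochberg(pvals)):
        fc = fold_change(*genes[name])
        print(f"{name}: FC={fc:.2f} p={p:.4f} FDR-adjusted p={q:.4f}")
```

With small groups such as these, an exact Mann-Whitney p-value would be preferable to the normal approximation; the approximation is used here only to keep the sketch free of third-party dependencies.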

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a932/7961455/3a93d73934ef/ijms-22-02622-g001a.jpg