Visual Clustering of Transcriptomic Data from Primary and Metastatic Tumors-Dependencies and Novel Pitfalls.

Affiliations

Institute of Pathology, Klinikum Stuttgart, 70174 Stuttgart, Germany.

Institute of Pathology, University of Würzburg, 97080 Würzburg, Germany.

Publication information

Genes (Basel). 2022 Jul 26;13(8):1335. doi: 10.3390/genes13081335.

DOI: 10.3390/genes13081335

PMID: 35893071

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9394300/

Abstract

Personalized oncology is a rapidly evolving field that offers cancer patients therapy options more specific than ever. However, the transcriptomic similarities and differences between metastases and their corresponding primary sites remain poorly understood. We applied two unsupervised dimension reduction methods, t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), to three datasets of metastases (n = 682 samples) under three data transformations (unprocessed, log10, and log10 + 1 transformed values) to visualize potential underlying clusters. Additionally, we analyzed two datasets (n = 616 samples) containing metastases and primary tumors of one entity to identify potential similarities. With these methods, no tight link between the site of resection and cluster formation could be demonstrated, either for datasets consisting solely of metastases or for mixed datasets. Instead, the choice of dimension reduction method and data transformation significantly impacted the visual clustering results. Our findings strongly suggest that data transformation should be considered another key element in the interpretation of visual clustering approaches, along with initialization and other parameters. Furthermore, the results highlight the need for a more thorough examination of the parameters used in cluster analysis.
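
The dependence on data transformation can be illustrated with a minimal sketch. Both t-SNE and UMAP build their embeddings from neighborhood relationships between samples, so a transformation that changes which samples are nearest to each other can change the resulting picture. The example below is not the authors' pipeline: the sample names, expression values, and helper functions are hypothetical, and reading "log10 + 1" as log10(x + 1) is an assumption. It only compares nearest neighbors by Euclidean distance and does not run t-SNE or UMAP.

```python
import math

# Toy expression values for three hypothetical samples across four genes.
# The numbers are invented for illustration and do not come from the study.
SAMPLES = {
    "A": [1.0, 10.0, 5000.0, 20.0],
    "B": [50.0, 300.0, 5100.0, 400.0],
    "C": [1.0, 12.0, 9000.0, 25.0],
}

# The three transformations named in the abstract. Interpreting
# "log10 + 1" as log10(x + 1) is an assumption made for this sketch.
TRANSFORMS = {
    "unprocessed": lambda x: x,
    "log10": lambda x: math.log10(x),  # only defined for x > 0
    "log10(x + 1)": lambda x: math.log10(x + 1.0),
}


def nearest_neighbor(query, samples, transform):
    """Return the name of the sample closest to `query` (Euclidean
    distance) after applying `transform` to every expression value."""
    transformed = {
        name: [transform(v) for v in values]
        for name, values in samples.items()
    }
    others = [name for name in transformed if name != query]
    return min(
        others,
        key=lambda name: math.dist(transformed[query], transformed[name]),
    )


def neighbors_by_transform(query="A"):
    """Map each transformation label to the nearest neighbor of `query`."""
    return {
        label: nearest_neighbor(query, SAMPLES, fn)
        for label, fn in TRANSFORMS.items()
    }


if __name__ == "__main__":
    for label, neighbor in neighbors_by_transform().items():
        print(f"{label}: nearest neighbor of A is {neighbor}")
```

On these toy values, the single highly expressed gene dominates the unprocessed distances, so sample A sits closest to B. After either logarithmic transformation the low-expression genes carry more weight and A sits closest to C. Any neighbor-based embedding of such data would therefore start from a different neighborhood structure depending on the transformation.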
