Analyzing the similarity of samples and genes by MG-PCC algorithm, t-SNE-SS and t-SNE-SG maps.

Affiliations

School of Mathematics, Southeast University, Nanjing, 210096, People's Republic of China.

Department of Mathematics, Nanjing Forestry University, Nanjing, 210037, People's Republic of China.

Publication information

BMC Bioinformatics. 2018 Dec 17;19(1):512. doi: 10.1186/s12859-018-2495-5.

DOI:10.1186/s12859-018-2495-5
PMID:30558536
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6296107/
Abstract

BACKGROUND

Clustering and visualizing samples and genes are important methods for analyzing gene expression data sets measured across different samples. However, it is difficult to integrate clustering and visualization techniques when the similarities of samples and genes are defined by the Pearson correlation coefficient (PCC).
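As a point of reference, the PCC between two expression profiles can be computed as in the minimal sketch below. It also checks a standard identity: after z-score standardization (using the population standard deviation), the squared Euclidean distance between two profiles of length n equals 2n(1 - r), where r is their PCC. This is one plausible reason why distance-based maps of standardized data can reflect PCC similarity. The function names and toy values are illustrative assumptions and are not taken from the paper.

```python
import math

def pcc(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for constant profiles

def standardize(x):
    """Z-score a sequence using the population standard deviation."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / s for a in x]

# Toy expression profiles (hypothetical values, not from the paper).
s1 = [2.0, 4.0, 6.0, 8.0, 7.0]
s2 = [1.0, 3.0, 2.0, 5.0, 4.0]

r = pcc(s1, s2)
z1, z2 = standardize(s1), standardize(s2)
sq_dist = sum((a - b) ** 2 for a, b in zip(z1, z2))

# For standardized profiles of length n: squared distance == 2 * n * (1 - r).
print(round(sq_dist, 6), round(2 * len(s1) * (1 - r), 6))
```

The two printed numbers agree, so ranking neighbors by Euclidean distance on standardized profiles gives the same order as ranking them by decreasing PCC.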

RESULTS

Here, for gene expression data sets with few samples, we use the MG-PCC algorithm (mini-groups defined by PCC) to divide the samples into mini-groups, and use t-SNE-SSP maps to display these mini-groups. The idea of the MG-PCC algorithm is that nearest neighbors should be in the same mini-group. The t-SNE-SSP map is selected from a series of t-SNE (t-distributed Stochastic Neighbor Embedding) maps of standardized samples, where the maps in the series differ in their perplexity parameter. Moreover, PCC clusters of large numbers of genes are displayed by a t-SNE-SGI map, which is selected from a series of t-SNE maps of standardized genes that differ in their initialization dimensions. Both the t-SNE-SSP and t-SNE-SGI maps are selected by A-value, which is modeled from the areas of the clustering projections: the selected map is the t-SNE map with the smallest A-value.
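The stated idea behind MG-PCC, that nearest neighbors should share a mini-group, can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: it links every sample to the one other sample with which it has the highest PCC and takes the connected components of those links as mini-groups. The names `pcc` and `mini_groups` and the toy profiles are hypothetical.

```python
import math

def pcc(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mini_groups(samples):
    """Group samples so that each shares a group with its nearest PCC neighbor.

    ASSUMPTION: 'nearest neighbor' means the other sample with the highest PCC;
    groups are the connected components of the nearest-neighbor links.
    """
    n = len(samples)
    parent = list(range(n))  # union-find forest over sample indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        nearest = max((k for k in range(n) if k != i),
                      key=lambda k: pcc(samples[i], samples[k]))
        parent[find(i)] = find(nearest)  # merge i with its nearest neighbor

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy profiles (hypothetical): samples 0 and 1 rise together, 2 and 3 fall together.
samples = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 11.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [10.0, 8.0, 6.0, 4.0, 1.0],
]
groups = mini_groups(samples)
print(groups)  # [[0, 1], [2, 3]]
```

Because every sample contributes exactly one link, each component is a small tree of nearest-neighbor relations, which matches the tree diagrams mentioned in the conclusions.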

CONCLUSIONS

From the analysis of cancer gene expression data sets, we demonstrate that the MG-PCC algorithm is able to put tumor and normal samples into their respective mini-groups, and that t-SNE-SSP (or t-SNE-SGI) maps are able to display the relationships between mini-groups (or PCC clusters) clearly. Furthermore, t-SNE-SS(m) (or t-SNE-SG(n)) maps are able to construct independent tree diagrams of the nearest sample (or gene) neighbors, where each tree diagram corresponds to a mini-group of samples (or genes).
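The map-selection step described in the Results (produce a series of maps, score each one, keep the one with the smallest score) can be sketched as below. The abstract does not give the formula for the A-value, so `a_value_proxy` is an explicitly assumed stand-in: the summed bounding-box areas of the groups divided by the bounding-box area of the whole map. The embeddings are hand-written toy coordinates keyed by a hypothetical perplexity value; no t-SNE is run here.

```python
def bbox_area(points):
    """Area of the axis-aligned bounding box of a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def a_value_proxy(points, labels):
    """ASSUMED stand-in for the paper's A-value (exact definition not in the abstract).

    Sum of per-group bounding-box areas divided by the whole-map bounding-box area.
    Smaller values mean the groups occupy compact, well-separated regions.
    """
    groups = {}
    for p, g in zip(points, labels):
        groups.setdefault(g, []).append(p)
    return sum(bbox_area(pts) for pts in groups.values()) / bbox_area(points)

def select_map(maps, labels):
    """Return the key (here: a perplexity value) of the map with the smallest score."""
    return min(maps, key=lambda k: a_value_proxy(maps[k], labels))

# Hypothetical 2-D embeddings of six samples in two mini-groups.
labels = [0, 0, 0, 1, 1, 1]
maps = {
    2: [(0, 0), (4, 1), (1, 5), (3, 4), (5, 5), (2, 0)],     # groups overlap
    5: [(0, 0), (1, 0), (0, 1), (9, 9), (10, 9), (9, 10)],   # groups compact
}
best = select_map(maps, labels)
print(best)  # 5
```

In practice each entry of `maps` would come from one t-SNE run on the standardized data, for example `sklearn.manifold.TSNE(perplexity=p).fit_transform(X)` for each candidate `p`; that pairing is a suggestion, not something the abstract specifies.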

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/2872e7fd4ea0/12859_2018_2495_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/b6831ba908e5/12859_2018_2495_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/be10a90da8e6/12859_2018_2495_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/198df1d6cb0c/12859_2018_2495_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/e9b1fabfdefc/12859_2018_2495_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/a786fdc1b88e/12859_2018_2495_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/3394a8859a9b/12859_2018_2495_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/9517724eb85f/12859_2018_2495_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e128/6296107/508864b03b01/12859_2018_2495_Fig9_HTML.jpg

