Conditional t-SNE: more informative t-SNE embeddings.

Authors

Kang Bo, García García Darío, Lijffijt Jefrey, Santos-Rodríguez Raúl, De Bie Tijl

Affiliations

Department of Electronics and Information Systems, IDLab, Ghent University, Ghent, Belgium.

Facebook AI, New York, USA.

Publication information

Mach Learn. 2021;110(10):2905-2940. doi: 10.1007/s10994-020-05917-0. Epub 2020 Dec 6.

DOI: 10.1007/s10994-020-05917-0
PMID: 34840420
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8599264/

Abstract

Dimensionality reduction and manifold learning methods such as t-distributed stochastic neighbor embedding (t-SNE) are frequently used to map high-dimensional data into a two-dimensional space to visualize and explore that data. Going beyond the specifics of t-SNE, there are two substantial limitations of any such approach: (1) not all information can be captured in a single two-dimensional embedding, and (2) to well-informed users, the salient structure of such an embedding is often already known, which prevents any genuinely new insights from being obtained. Currently, it is not known how to extract the remaining information in a similarly effective manner. We introduce conditional t-SNE (ct-SNE), a generalization of t-SNE that discounts prior information in the form of labels. This enables obtaining more informative and more relevant embeddings. To achieve this, we propose a conditioned version of the t-SNE objective, obtaining an elegant method with a single integrated objective. We show how to efficiently optimize the objective and study the effects of the extra parameter that ct-SNE has over t-SNE. Qualitative and quantitative empirical results on synthetic and real data show ct-SNE is scalable, effective, and achieves its goal: it allows complementary structure to be captured in the embedding and provides new insights into real data.
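
The idea of discounting label information inside a t-SNE-style objective can be illustrated with a small sketch. The abstract does not give the ct-SNE equations, so everything below is an assumption made for illustration: the function names, the parameters `same_weight` and `diff_weight`, and the reweighting rule itself are hypothetical and are not taken from the paper. The sketch computes the usual Student-t similarities of a 2D embedding, rescales each pair depending on whether the two points share a label, renormalizes, and evaluates a KL divergence against a target distribution.

```python
import math


def student_t_similarities(points):
    """Pairwise Student-t similarities of an embedding, normalised to sum to 1."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = 1.0 / (1.0 + sq_dist)
                total += weights[i][j]
    return [[w / total for w in row] for row in weights]


def discount_labels(q, labels, same_weight, diff_weight):
    """ASSUMED rule (not from the paper): scale q_ij by same_weight when the
    labels of i and j match, by diff_weight otherwise, then renormalise.
    Both weights must be positive."""
    n = len(q)
    scaled = [
        [q[i][j] * (same_weight if labels[i] == labels[j] else diff_weight)
         for j in range(n)]
        for i in range(n)
    ]
    total = sum(sum(row) for row in scaled)
    return [[s / total for s in row] for row in scaled]


def kl_divergence(p, r):
    """KL(p || r) over all pairs, skipping pairs where p is zero."""
    return sum(
        p_ij * math.log(p_ij / r_ij)
        for p_row, r_row in zip(p, r)
        for p_ij, r_ij in zip(p_row, r_row)
        if p_ij > 0.0
    )


# Toy embedding: three points, the first two share a label.
embedding = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
labels = [0, 0, 1]

q = student_t_similarities(embedding)
r = discount_labels(q, labels, same_weight=2.0, diff_weight=1.0)

# The same-label pair (0, 1) receives more probability mass after reweighting.
print(r[0][1] > q[0][1])
```

Under this assumed rule, setting `same_weight` equal to `diff_weight` leaves the similarities unchanged, which corresponds to ignoring the labels. When same-label pairs are up-weighted, the embedding no longer needs to place them close together to account for their similarity, which is one way to read "discounting" known structure.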


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/05399e3dfd07/10994_2020_5917_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/b1f5c196eabe/10994_2020_5917_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/6fdead7d5f75/10994_2020_5917_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/3a639888fed8/10994_2020_5917_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/522486b26340/10994_2020_5917_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/6aacdad4e885/10994_2020_5917_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/ddb7d7ca29ca/10994_2020_5917_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/3add9c5d9e03/10994_2020_5917_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/0d9319df3c3c/10994_2020_5917_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b266/8599264/c6944c801a10/10994_2020_5917_Fig9_HTML.jpg
