Genomics data analysis via spectral shape and topology.

Affiliations

Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, United States of America.

Department of Mathematics, University of Hawaii at Manoa, Honolulu, HI, United States of America.

Publication information

PLoS One. 2023 Apr 26;18(4):e0284820. doi: 10.1371/journal.pone.0284820. eCollection 2023.

DOI:10.1371/journal.pone.0284820
PMID:37099525
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10132553/
Abstract

Mapper, a topological algorithm, is frequently used as an exploratory tool to build a graphical representation of data. This representation can help to gain a better understanding of the intrinsic shape of high-dimensional genomic data and to retain information that may be lost using standard dimension-reduction algorithms. We propose a novel workflow to process and analyze RNA-seq data from tumor and healthy subjects integrating Mapper, differential gene expression, and spectral shape analysis. Precisely, we show that a Gaussian mixture approximation method can be used to produce graphical structures that successfully separate tumor and healthy subjects, and produce two subgroups of tumor subjects. A further analysis using DESeq2, a popular tool for the detection of differentially expressed genes, shows that these two subgroups of tumor cells bear two distinct gene regulations, suggesting two discrete paths for forming lung cancer, which could not be highlighted by other popular clustering methods, including t-distributed stochastic neighbor embedding (t-SNE). Although Mapper shows promise in analyzing high-dimensional data, tools to statistically analyze Mapper graphical structures are limited in the existing literature. In this paper, we develop a scoring method using heat kernel signatures that provides an empirical setting for statistical inferences such as hypothesis testing, sensitivity analysis, and correlation analysis.
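
The abstract names heat kernel signatures as the basis of the scoring method but gives no formula. As a minimal sketch, the block below computes the standard heat kernel signature of each node of a graph, HKS(v, t) = sum_i exp(-t * lambda_i) * phi_i(v)^2, from the eigenpairs of the combinatorial graph Laplacian. The function name, the choice of the unnormalized Laplacian, and the toy path graph are assumptions made for illustration; this is not the authors' scoring procedure, which may aggregate or normalize these values differently.

```python
import numpy as np

def heat_kernel_signature(adjacency, times):
    """Return an (n_nodes, n_times) array of heat kernel signatures.

    Hypothetical helper: uses the unnormalized Laplacian L = D - A of an
    undirected graph given by a symmetric adjacency matrix.
    """
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh is appropriate because the Laplacian is symmetric.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    times = np.asarray(times, dtype=float)
    # hks[v, k] = sum_i exp(-times[k] * eigvals[i]) * eigvecs[v, i] ** 2
    return (eigvecs ** 2) @ np.exp(-np.outer(eigvals, times))

# Toy stand-in for a Mapper graph: a path on four nodes.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
hks = heat_kernel_signature(A, times=[0.0, 1.0, 10.0])
print(hks.round(3))
```

At t = 0 every node has signature 1, because the eigenvectors are orthonormal, and on a connected graph the values approach 1 / n_nodes as t grows. The intermediate time scales are the ones that distinguish nodes by their local neighborhood structure.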
