Omada: robust clustering of transcriptomes through multiple testing.

Affiliations

Singapore Institute for Clinical Sciences, Agency for Science, Technology and Research (A*STAR), 30 Medical Dr, 117609, Singapore, Republic of Singapore.

Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis St, Matrix, 138671, Singapore, Republic of Singapore.

Publication information

Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae039.

DOI:10.1093/gigascience/giae039
PMID:38991852
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11238428/
Abstract

BACKGROUND

Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High-throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, but selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this, we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning-based functions.

FINDINGS

The efficiency of each tool was tested with 7 datasets characterized by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit's decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements.

CONCLUSIONS

In conclusion, Omada successfully automates the robust unsupervised clustering of transcriptomic data, making advanced analysis accessible and reliable even for those without extensive machine learning expertise. Implementation of Omada is available at http://bioconductor.org/packages/omada/.
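
The abstract refers to finding the "real number of stable partitions" in a dataset. As a rough illustration of that general idea only, the sketch below scores each candidate number of clusters by how well a clustering of random subsamples agrees with a clustering of the full data. It is not Omada's procedure or API: the toy 2-D data, the plain k-means with farthest-point seeding, the Rand-index agreement measure, and every function name (`toy_data`, `kmeans`, `rand_index`, `stability`) are assumptions made for this example.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point that is farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs that both labelings treat the same way
    (together in one cluster, or apart in different clusters)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def stability(points, k, n_resamples=10, frac=0.8, seed=0):
    """Mean Rand index between the full-data clustering and clusterings of
    random subsamples, compared on the subsampled points only."""
    rng = random.Random(seed)
    reference = kmeans(points, k)
    scores = []
    for _ in range(n_resamples):
        idx = rng.sample(range(len(points)), int(frac * len(points)))
        sub_labels = kmeans([points[i] for i in idx], k)
        scores.append(rand_index([reference[i] for i in idx], sub_labels))
    return sum(scores) / len(scores)


def toy_data(seed=0, per_group=20):
    """Three well-separated 2-D groups standing in for sample subgroups
    (hypothetical data, not expression profiles from the paper)."""
    rng = random.Random(seed)
    groups = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
            for cx, cy in groups for _ in range(per_group)]


points = toy_data()
scores = {k: stability(points, k) for k in range(2, 6)}
for k, s in scores.items():
    print(k, round(s, 3))
```

On this toy data the three groups are separated widely enough that every subsample recovers the same three-way split, so the score for k = 3 is a perfect 1.0. Real transcriptomic data would need normalization, feature selection, and far more resamples, and a production tool would compare several clustering algorithms rather than a single k-means.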

