COMBING: Clustering in Oncology for Mathematical and Biological Identification of Novel Gene Signatures.

Authors

Battistella Enzo, Vakalopoulou Maria, Sun Roger, Estienne Theo, Lerousseau Marvin, Nikolaev Sergey, Andres Emilie Alvarez, Carre Alexandre, Niyoteka Stephane, Robert Charlotte, Paragios Nikos, Deutsch Eric

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3317-3331. doi: 10.1109/TCBB.2021.3123910. Epub 2022 Dec 8.

DOI: 10.1109/TCBB.2021.3123910
PMID: 34714749
Abstract

Precision medicine is a paradigm shift in healthcare that relies heavily on genomics data. However, the complexity of biological interactions, the large number of genes, and the lack of comparative analyses of the data remain a tremendous bottleneck to clinical adoption. In this paper, we introduce a novel, automatic, and unsupervised framework to discover low-dimensional gene biomarkers. Our method is based on the LP-Stability algorithm, a high-dimensional center-based unsupervised clustering algorithm. It is modular with respect to the metric function, scales well, and automatically determines the best number of clusters. Our evaluation combines mathematical and biological criteria to define a quantitative metric. The recovered signature is applied to a variety of biological tasks, including screening of biological pathways and functions, and assessment of its relevance for characterizing tumor types and subtypes. Quantitative comparisons among different distance metrics, commonly used clustering methods, and a referential gene signature used in the literature confirm the state-of-the-art performance of our approach. In particular, our signature, based on 27 genes, reports at least 30 times better mathematical significance (average Dunn's Index) and 25% better biological significance (average Enrichment in Protein-Protein Interaction) than those produced by other referential clustering methods. Finally, our signature reports promising results on distinguishing immune inflammatory and immune desert tumors, while reporting a high balanced accuracy of 92% on tumor type classification and an averaged balanced accuracy of 68% on tumor subtype classification, which is, respectively, 7% and 9% higher than the referential signature.
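
The abstract names two evaluation measures, Dunn's Index and balanced accuracy, without defining them. The following is a minimal sketch of both, not the authors' implementation. It assumes one common variant of Dunn's Index (smallest distance between points in different clusters divided by the largest within-cluster diameter) and Euclidean distance; the abstract does not say which variant or metric the paper uses. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dunn_index(points, labels, dist=euclidean):
    """Dunn's Index: min inter-cluster separation / max intra-cluster diameter.

    Assumed variant: separation is the closest pair of points drawn from two
    different clusters; diameter is the farthest pair inside one cluster.
    Higher values indicate compact, well-separated clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Dunn's Index needs at least two clusters")

    max_diameter = max(
        (dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,  # every cluster is a singleton
    )
    min_separation = min(
        dist(a, b)
        for members_a, members_b in combinations(clusters.values(), 2)
        for a in members_a
        for b in members_b
    )
    if max_diameter == 0.0:
        return math.inf
    return min_separation / max_diameter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    recalls = []
    for cls in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy data (made up for illustration): two tight clusters ten units apart.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
toy_dunn = dunn_index(toy_points, toy_labels)

# Toy classification: class 0 recall is 2/3, class 1 recall is 1.
toy_bacc = balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1])
```

Because balanced accuracy averages recall over classes, it is not inflated by a dominant class, which matters when tumor types or subtypes are unevenly represented.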


Similar articles

1. COMBING: Clustering in Oncology for Mathematical and Biological Identification of Novel Gene Signatures.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3317-3331. doi: 10.1109/TCBB.2021.3123910. Epub 2022 Dec 8.
2. Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes.
   BMC Bioinformatics. 2006 Aug 31;7:397. doi: 10.1186/1471-2105-7-397.
3. Clustering gene expression data using a diffraction-inspired framework.
   Biomed Eng Online. 2012 Nov 19;11:85. doi: 10.1186/1475-925X-11-85.
4. Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis.
   Proc IEEE Comput Soc Bioinform Conf. 2002;1:246-55.
5. Simultaneous gene clustering and subset selection for sample classification via MDL.
   Bioinformatics. 2003 Jun 12;19(9):1100-9. doi: 10.1093/bioinformatics/btg039.
6. New algorithms for multi-class cancer diagnosis using tumor gene expression signatures.
   Bioinformatics. 2003 Sep 22;19(14):1800-7. doi: 10.1093/bioinformatics/btg238.
7. A systematic comparison of data- and knowledge-driven approaches to disease subtype discovery.
   Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab314.
8. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
   Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
9. Metric for measuring the effectiveness of clustering of DNA microarray expression.
   BMC Bioinformatics. 2006 Sep 6;7 Suppl 2(Suppl 2):S5. doi: 10.1186/1471-2105-7-S2-S5.
10. Evaluation of clustering algorithms for gene expression data.
    BMC Bioinformatics. 2006 Dec 12;7 Suppl 4(Suppl 4):S17. doi: 10.1186/1471-2105-7-S4-S17.

Cited by

1. Convergence of evolving artificial intelligence and machine learning techniques in precision oncology.
   NPJ Digit Med. 2025 Jan 31;8(1):75. doi: 10.1038/s41746-025-01471-y.
2. Research on Artificial-Intelligence-Assisted Medicine: A Survey on Medical Artificial Intelligence.
   Diagnostics (Basel). 2024 Jul 9;14(14):1472. doi: 10.3390/diagnostics14141472.
3. Improving the performance and interpretability on medical datasets using graphical ensemble feature selection.
   Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae341.