PartIES: a disease subtyping framework with Partition-level Integration using diffusion-Enhanced Similarities from multi-omics Data.

Affiliation

Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10027, United States.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae609.

Abstract

Integrating multi-omics data helps identify disease subtypes. Many similarity-based methods have been developed for disease subtyping using multi-omics data, but many of them focus on extracting clustering structures common to all omics data types and do not preserve data-type-specific clustering structures. Moreover, the clustering performance of similarity-based methods degrades when similarity measures are noisy. Here we propose PartIES (Partition-level Integration using diffusion-Enhanced Similarities), a method for disease subtyping using multi-omics data. PartIES first uses diffusion to reduce noise in the similarity/kernel matrix of each omics data type, then extracts partition information from the diffusion-enhanced similarity matrices and iteratively integrates the partition-level similarities through a weighted average. Simulation studies showed that (1) the diffusion step enhances clustering accuracy, and (2) PartIES outperforms competing methods, particularly when omics data types provide different clustering structures. Using mRNA, long noncoding RNA, and microRNA expression data, DNA methylation data, and somatic mutation data from The Cancer Genome Atlas project, PartIES identified subtypes in bladder urothelial carcinoma, liver hepatocellular carcinoma, and thyroid carcinoma that were more significantly associated with patient survival than the subtypes identified by any competing method. Further investigation suggested that, among subtype-associated genes, many of those that interact heavily with other genes are known important cancer genes. The identified cancer subtypes also show different activity levels for some known cancer-related pathways. The R code is available at https://github.com/yuqimiao/PartIES.git.
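
The pipeline described in the abstract (denoise each similarity matrix by diffusion, extract a partition per data type, then iteratively combine partition-level similarities with a weighted average) can be illustrated with a short NumPy sketch. This is not the authors' implementation, which is the R code linked above. Every function name, the random-walk form of the diffusion, the median-distance kernel bandwidth, the agreement-based weight update, and the simulated two-view data are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def gaussian_kernel(X):
    """Gaussian similarity; the median-distance bandwidth is an assumed choice."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * np.median(d2)))

def diffuse(S, steps=3):
    """Assumed diffusion: a `steps`-step random walk on S, then re-symmetrised."""
    P = S / S.sum(axis=1, keepdims=True)        # row-stochastic transition matrix
    Pt = np.linalg.matrix_power(P, steps)
    return (Pt + Pt.T) / 2.0

def top_eigvecs(A, k):
    """Eigenvectors of a symmetric matrix for its k largest eigenvalues."""
    _, vecs = np.linalg.eigh(A)                 # eigenvalues come back ascending
    return vecs[:, -k:]

def spectral_embedding(S, k):
    """Spectral embedding from the degree-normalised similarity matrix."""
    d = S.sum(axis=1)
    return top_eigvecs(S / np.sqrt(np.outer(d, d)), k)

def integrate(similarities, k, n_iter=10):
    """Weighted average of partition-level similarities U_v U_v^T, iterated."""
    U_views = [spectral_embedding(diffuse(S), k) for S in similarities]
    parts = [U @ U.T for U in U_views]          # partition-level similarities
    w = np.full(len(parts), 1.0 / len(parts))   # start from equal weights
    for _ in range(n_iter):
        M = sum(wi * Pi for wi, Pi in zip(w, parts))
        U = top_eigvecs(M, k)                   # consensus embedding
        # Assumed update: weight each view by its agreement with the consensus.
        agree = np.array([np.linalg.norm(U.T @ Uv) ** 2 for Uv in U_views])
        w = agree / agree.sum()
    return U, w

def kmeans(X, k, iters=50):
    """Small deterministic k-means with farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Simulated data: two hypothetical "omics" views of 40 samples in 2 subtypes.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
view1 = rng.normal(size=(40, 5)) + 5.0 * truth[:, None]
view2 = rng.normal(size=(40, 5)) + 3.0 * truth[:, None]

U, weights = integrate([gaussian_kernel(view1), gaussian_kernel(view2)], k=2)
labels = kmeans(U / np.linalg.norm(U, axis=1, keepdims=True), k=2)
print("view weights:", weights)
print("subtype labels:", labels)
```

Integrating at the partition level, rather than averaging the raw similarity matrices, means each data type contributes its own clustering structure before the views are combined, which is the property the abstract emphasizes for data types that disagree.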

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae94/11586768/cf59d28a00d8/bbae609f1.jpg
