Mutual information for detecting multi-class biomarkers when integrating multiple bulk or single-cell transcriptomic studies.

Author information

Zou Jian, Li Zheqi, Carleton Neil, Oesterreich Steffi, Lee Adrian V, Tseng George C

Affiliations

Department of Statistics, School of Public Health, Chongqing Medical University, Chongqing, Chongqing 400016, China.

Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, United States.

Publication information

Bioinformatics. 2024 Nov 28;40(12). doi: 10.1093/bioinformatics/btae696.

Abstract

MOTIVATION

Biomarker detection plays a pivotal role in biomedical research. Integrating omics studies from multiple cohorts can enhance the statistical power, accuracy, and robustness of the detection results. However, existing methods for horizontally combining omics studies are mostly designed for two-class scenarios (e.g. cases versus controls) and are not directly applicable to studies with a multi-class design (e.g. samples from multiple disease subtypes, treatments, tissues, or cell types).

RESULTS

We propose a statistical framework, namely Mutual Information Concordance Analysis (MICA), to detect biomarkers with concordant multi-class expression patterns across multiple omics studies from an information-theoretic perspective. Our approach first detects biomarkers with concordant multi-class patterns across some or all of the omics studies using a global test based on mutual information. A post hoc analysis is then performed for each detected biomarker to identify the studies with a concordant pattern. Extensive simulations demonstrate improved accuracy and successful false discovery rate control of MICA compared to an existing multi-class correlation method. The method is then applied to two practical scenarios: mouse metabolism-related transcriptomic studies from four tissues, and estrogen treatment expression profiles from three sources. Biomarkers detected by MICA yield intriguing biological insights and functional annotations. Additionally, we implemented MICA for single-cell RNA-Seq data to detect tumor progression biomarkers, highlighting critical roles of ribosomal function in the tumor microenvironment of triple-negative breast cancer and underscoring the potential of MICA for detecting novel therapeutic targets.
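The abstract does not give the MICA test statistic, so the following is only a minimal Python sketch of the general idea that mutual information can capture a multi-class pattern shared across studies. It is not the authors' method and does not use the MICA R package. All names (`rank_bins`, `pooled_mi`, `permutation_pvalue`) and the toy data are assumptions made for illustration. Expression is rank-binned within each study, the bins are pooled, and the plug-in mutual information between bin and class label is computed; a within-study label permutation gives a rough p-value.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # Sum over observed cells of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def rank_bins(values, n_bins):
    """Assign each value to one of n_bins roughly equal-sized rank-based bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def pooled_mi(studies, n_bins=3):
    """studies: list of (expression_values, class_labels) pairs, one per study.

    Binning is done within each study, so scale differences between studies
    do not matter; only the class pattern does. (Illustrative assumption.)
    """
    bins, labels = [], []
    for expr, labs in studies:
        bins.extend(rank_bins(expr, n_bins))
        labels.extend(labs)
    return mutual_information(bins, labels)


def permutation_pvalue(studies, n_perm=200, seed=0, n_bins=3):
    """Permutation p-value: shuffle class labels independently within each study."""
    rng = random.Random(seed)
    observed = pooled_mi(studies, n_bins)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for expr, labs in studies:
            perm = list(labs)
            rng.shuffle(perm)
            shuffled.append((expr, perm))
        if pooled_mi(shuffled, n_bins) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): one gene, three classes, two samples per class.
labels = [0, 0, 1, 1, 2, 2]
study_a = ([1, 2, 5, 6, 9, 10], labels)
study_b_concordant = ([10, 20, 50, 60, 90, 100], labels)  # same class ordering
study_b_discordant = ([90, 100, 10, 20, 50, 60], labels)  # different class ordering

mi_concordant = pooled_mi([study_a, study_b_concordant])
mi_discordant = pooled_mi([study_a, study_b_discordant])
p_concordant = permutation_pvalue([study_a, study_b_concordant])
print(mi_concordant, mi_discordant, p_concordant)
```

In this toy setup the pooled mutual information is higher when both studies share the same class ordering than when the ordering differs, even though each study taken alone separates the classes equally well. That is the intuition behind testing for concordance rather than per-study association.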

AVAILABILITY AND IMPLEMENTATION

The source code is available on Figshare at https://doi.org/10.6084/m9.figshare.27635436. Additionally, the R package can be installed directly from GitHub at https://github.com/jianzou75/MICA.
