Genome-wide expression in human whole blood for diagnosis of latent tuberculosis infection: a multicohort research.

Author information

Jiang Fan, Liu Yanhua, Li Linsheng, Ni Ruizi, An Yajing, Li Yufeng, Zhang Lingxia, Gong Wenping

Affiliations

Institute of Tuberculosis Research, Senior Department of Tuberculosis, The Eighth Medical Center of PLA General Hospital, Beijing, China.

Section of Health, No. 94804 Unit of the Chinese People's Liberation Army, Shanghai, China.

Publication information

Front Microbiol. 2025 May 9;16:1584360. doi: 10.3389/fmicb.2025.1584360. eCollection 2025.

Abstract

BACKGROUND

Tuberculosis (TB) remains a significant global health challenge, necessitating reliable biomarkers for differentiation between latent tuberculosis infection (LTBI) and active tuberculosis (ATB). This study aimed to identify blood-based biomarkers differentiating LTBI from ATB through multicohort analysis of public datasets.

METHODS

We systematically screened 18 datasets from the NIH Gene Expression Omnibus (GEO), ultimately including 11 cohorts comprising 2,758 patients across 8 countries/regions and 13 ethnicities. Cohorts were stratified into training (8 cohorts, n = 1,933) and validation sets (3 cohorts, n = 825) based on functional assignment.

RESULTS

Through UpSet analysis, LASSO (Least Absolute Shrinkage and Selection Operator), SVM-RFE (Support Vector Machine Recursive Feature Elimination), and MCL (Markov Cluster Algorithm) clustering of protein-protein interaction networks, we identified S100A12 and S100A8 as optimal biomarkers. A Naive Bayes (NB) model incorporating these two markers demonstrated robust diagnostic performance: training set AUC, median = 0.8572 (interquartile range 0.8002, 0.8708); validation AUC = 0.5719 (0.51645, 0.7078); and subgroup AUC = 0.8635 (0.8212, 0.8946).
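As a rough illustration of the kind of model and summary statistic reported above, the following sketch fits a two-feature Gaussian Naive Bayes classifier and summarizes per-cohort AUC as a median with interquartile range. It is not the authors' pipeline: the abstract does not state which NB variant was used or how the median and IQR were computed, so the Gaussian likelihood, the per-cohort summary, and all function names here are assumptions. The data are synthetic and do not reflect real S100A12/S100A8 expression values.

```python
import math
import random
import statistics


def fit_gaussian_nb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    params = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        means = [statistics.fmean(c) for c in cols]
        # Small floor keeps every variance strictly positive.
        variances = [statistics.pvariance(c) + 1e-9 for c in cols]
        params[label] = (len(rows) / len(X), means, variances)
    return params


def log_joint(params, x, label):
    """log P(label) + sum of per-feature Gaussian log-likelihoods."""
    prior, means, variances = params[label]
    total = math.log(prior)
    for value, mu, var in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
    return total


def score_atb(params, x):
    """Log-odds of class 1 (ATB) versus class 0 (LTBI)."""
    return log_joint(params, x, 1) - log_joint(params, x, 0)


def auc(scores, labels):
    """Rank-based AUC: chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def synthetic_cohort(rng, n_per_class, shift):
    """Hypothetical two-gene expression values; NOT the study's data."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            centre = label * shift
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = synthetic_cohort(rng, 200, shift=1.5)
model = fit_gaussian_nb(X_train, y_train)

# Score eight synthetic "cohorts" and summarize AUC across them.
cohort_aucs = []
for _ in range(8):
    X_c, y_c = synthetic_cohort(rng, 50, shift=1.5)
    cohort_aucs.append(auc([score_atb(model, x) for x in X_c], y_c))

q1, median, q3 = statistics.quantiles(cohort_aucs, n=4)
print(f"median AUC = {median:.4f} (IQR {q1:.4f}, {q3:.4f})")
```

Reporting the median and quartiles rather than a single pooled AUC keeps one unusually easy or hard cohort from dominating the summary, which is one plausible reason to present multicohort results this way.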

CONCLUSION

Our multicohort analysis established an NB-based diagnostic model utilizing S100A12/S100A8, which maintains diagnostic accuracy across diverse geographic, ethnic, and clinical variables (including HIV co-infection), highlighting its potential for clinical translation in LTBI/ATB differentiation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b2c3/12101067/19761dbaa8e8/fmicb-16-1584360-g001.jpg
