Comparison of gene set scoring methods for reproducible evaluation of multiple tuberculosis gene signatures.

Authors

Wang Xutao, VanValkenberg Arthur, Odom-Mabey Aubrey R, Ellner Jerrold J, Hochberg Natasha S, Salgame Padmini, Patil Prasad, Johnson W Evan

Affiliations

Department of Biostatistics, Boston University, Boston, MA, USA.

Division of Computational Biomedicine and Bioinformatics Program, Boston University, Boston, MA, USA.

Publication information

bioRxiv. 2023 Jan 30:2023.01.19.520627. doi: 10.1101/2023.01.19.520627.

DOI:10.1101/2023.01.19.520627
PMID:36711818
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9882404/
Abstract

RATIONALE

Many blood-based transcriptional gene signatures for tuberculosis (TB) have been developed, with potential use in diagnosing disease, predicting the risk of progression from infection to disease, and monitoring TB treatment outcomes. However, an unresolved issue is whether gene set enrichment analysis (GSEA) of the signature transcripts alone is sufficient for prediction and differentiation, or whether it is necessary to use the original statistical model created when the signature was derived. Intra-method comparison is complicated by the unavailability of original training data, missing details about the original trained models, and inadequate publicly available software tools or source code implementing the models. To facilitate the replicability and appropriate use of these signatures in TB research, comprehensive comparisons between gene set scoring methods, with cross-data validation of original model implementations, are needed.

OBJECTIVES

We compared the performance of 19 TB gene signatures across 24 transcriptomic datasets using both rebuilt original models and gene set scoring methods, to evaluate whether gene set scoring is a reasonable proxy for the performance of the original trained model. We have provided an open-access software implementation of the original models for all 19 signatures for future use.

METHODS

We considered existing gene set scoring and machine learning methods, including ssGSEA, GSVA, PLAGE, Singscore, and Zscore, as alternative approaches to profile gene signature performance. The sample-size-weighted mean area under the curve (AUC) value was computed to measure each signature's performance across datasets. Correlation analysis and Wilcoxon paired tests were used to compare the performance of the enrichment methods with that of the original models.
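The sample-size-weighted mean AUC described above can be illustrated with a minimal sketch. It assumes that each dataset contributes one AUC value and that the weight is simply that dataset's number of samples; the function names `auc_from_scores` and `weighted_mean_auc` are hypothetical and are not taken from the authors' software.

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC via the Mann-Whitney U statistic; tied scores get average ranks."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)
    i = 0
    while i < scores.size:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        # Assign the average 1-based rank to every member of the run.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

def weighted_mean_auc(aucs, sample_sizes):
    """Mean of per-dataset AUCs, weighted by dataset sample size (assumed weighting)."""
    return float(np.average(aucs, weights=sample_sizes))

# Illustrative values only, not results from the study.
per_dataset_auc = [0.9, 0.7]
per_dataset_n = [100, 300]
summary = weighted_mean_auc(per_dataset_auc, per_dataset_n)
```

Under this weighting, larger datasets pull the summary toward their own AUC, so a signature that performs well only in small cohorts is not over-credited.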

MEASUREMENT AND MAIN RESULTS

For many signatures, the predictions from gene set scoring methods were highly correlated with, and statistically equivalent to, the results given by the original diagnostic models. PLAGE outperformed all other gene set scoring methods. In some cases, PLAGE also outperformed the original models, both in the signatures' weighted mean AUC values and in the AUC results within individual studies.
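For readers unfamiliar with the two simplest scoring methods named here, the following is a rough sketch of the general Zscore and PLAGE ideas as they are commonly described: Zscore sums per-gene standardized expression and divides by the square root of the gene count, while PLAGE takes the first right singular vector of the standardized signature-gene matrix. It assumes a genes-by-samples matrix, and the helper names are hypothetical. It is not the authors' implementation, and the sign of a PLAGE score is arbitrary, so it may need to be oriented before computing an AUC.

```python
import numpy as np

def standardize_genes(expr):
    """Center and scale each gene (row) across samples."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    sd[sd == 0] = 1.0  # leave constant genes at zero instead of dividing by zero
    return (expr - mean) / sd

def zscore_score(expr, gene_idx):
    """Combined z-score per sample: sum of gene z-scores / sqrt(number of genes)."""
    z = standardize_genes(expr)[gene_idx]
    return z.sum(axis=0) / np.sqrt(len(gene_idx))

def plage_score(expr, gene_idx):
    """PLAGE-style score: first right singular vector of the standardized
    signature-gene matrix, giving one value per sample (sign is arbitrary)."""
    z = standardize_genes(expr)[gene_idx]
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return vt[0]

# Toy matrix: 3 perfectly co-varying genes across 4 samples (illustrative only).
toy_expr = np.array([[1.0, 2.0, 3.0, 4.0],
                     [2.0, 4.0, 6.0, 8.0],
                     [0.0, 1.0, 2.0, 3.0]])
signature = [0, 1, 2]
z_scores = zscore_score(toy_expr, signature)
p_scores = plage_score(toy_expr, signature)
```

When signature genes move together, as in this toy matrix, the two scores agree up to sign and scale; they diverge when a signature mixes up- and down-regulated genes, because the singular vector can assign those genes opposite weights while the plain sum cancels them.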

CONCLUSION

Gene set enrichment scoring of existing blood-based biomarker gene sets can distinguish patients with active TB disease from latent TB infection and other clinical conditions with equivalent or improved accuracy compared to the original methods and models. These data justify using gene set scoring methods of published TB gene signatures for predicting TB risk and treatment outcomes, especially when original models are difficult to apply or implement.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16a0/9901241/52d2c612e756/nihpp-2023.01.19.520627v2-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16a0/9901241/daf3081b2cbe/nihpp-2023.01.19.520627v2-f0001.jpg
