Topic model-based mass spectrometric data analysis in cancer biomarker discovery studies.

Authors

Wang Minkun, Tsai Tsung-Heng, Di Poto Cristina, Ferrarini Alessia, Yu Guoqiang, Ressom Habtom W

Affiliations

Department of Oncology, Georgetown University, 4000 Reservoir Rd NW, Washington D.C., USA.

Department of Electrical and Computer Engineering, Virginia Tech, 900 N Glebe Rd, Arlington, VA, USA.

Publication information

BMC Genomics. 2016 Aug 18;17 Suppl 4(Suppl 4):545. doi: 10.1186/s12864-016-2796-x.

DOI:10.1186/s12864-016-2796-x
PMID:27535232
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5001243/
Abstract

BACKGROUND

A fundamental challenge in quantifying biomolecules for cancer biomarker discovery stems from the heterogeneous nature of human biospecimens. Although this issue has been discussed in cancer genomic studies, it has not yet been rigorously investigated in mass spectrometry-based proteomic and metabolomic studies. Purification of mass spectrometric data is highly desirable prior to subsequent analysis, e.g., quantitative comparison of the abundance of biomolecules in biological samples.

METHODS

We investigated topic models for the computational analysis of mass spectrometric data, considering both integrated peak intensities and scan-level features, i.e., extracted ion chromatograms (EICs). Probabilistic generative models enable a flexible representation of the data structure and infer sample-specific pure sources. Scan-level modeling helps alleviate information loss during data preprocessing. We evaluated the capability of the proposed models to capture mixture proportions of contaminants and cancer profiles on LC-MS-based serum proteomic and GC-MS-based tissue metabolomic datasets acquired from patients with hepatocellular carcinoma (HCC) and liver cirrhosis, as well as on synthetic data we generated from the serum proteomic data.
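The mixture idea behind this kind of purification can be illustrated with a small sketch. It is a deliberately simplified stand-in, not the authors' model: it assumes a single sample whose intensities are a two-component mixture of a known contaminant profile and a known pure profile, and it estimates only the contaminant proportion with a basic EM loop. The function names, the toy profiles, and the 30% contamination level are all hypothetical.

```python
import random


def normalize(values):
    """Scale non-negative values so they sum to 1 (a discrete distribution)."""
    total = sum(values)
    return [v / total for v in values]


def estimate_contaminant_proportion(intensities, contaminant, pure,
                                    n_iter=500, tol=1e-12):
    """EM estimate of the contaminant share in one sample.

    Assumed model (for illustration only): each unit of intensity at a
    feature comes from the contaminant profile with probability pi and
    from the pure profile otherwise.
    """
    total = sum(intensities)
    pi = 0.5  # uninformative starting point
    for _ in range(n_iter):
        weighted = 0.0
        for x, c, p in zip(intensities, contaminant, pure):
            mix = pi * c + (1.0 - pi) * p
            if mix > 0.0:
                # E-step: part of this feature's intensity attributed
                # to the contaminant under the current pi
                weighted += x * (pi * c) / mix
        new_pi = weighted / total  # M-step: updated contaminant share
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi


# Hypothetical toy data: 50 features, 30% contamination, no noise.
rng = random.Random(0)
contaminant_profile = normalize([rng.expovariate(1.0) for _ in range(50)])
pure_profile = normalize([rng.expovariate(1.0) for _ in range(50)])
true_pi = 0.3
sample = [1e6 * (true_pi * c + (1.0 - true_pi) * p)
          for c, p in zip(contaminant_profile, pure_profile)]

estimated_pi = estimate_contaminant_proportion(
    sample, contaminant_profile, pure_profile)
print(f"true proportion: {true_pi:.3f}, estimated: {estimated_pi:.3f}")
```

In a full topic model the profiles themselves would also be unknown and inferred jointly across samples; fixing them here keeps the sketch short and makes the role of the mixture proportion easy to see.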

RESULTS

Analysis of the synthetic data demonstrated that both intensity-level and scan-level purification models can accurately infer the mixture proportions and the underlying true cancerous sources, with small average error ratios (<7%) between estimates and ground truth. By applying topic model-based purification to mass spectrometric data, we found more proteins and metabolites with significant changes between HCC cases and cirrhotic controls. Candidate biomarkers selected after purification yielded biologically meaningful pathway analysis results and improved disease discrimination power, measured by the area under the ROC curve, compared with the results obtained prior to purification.
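The two evaluation quantities mentioned here can be sketched as follows. The abstract does not define the error ratio, so the code assumes it is the mean of |estimate - truth| / |truth|; the AUC is computed as the fraction of case/control pairs in which the case scores higher, with ties counted as half. All numbers are made up for illustration and are not taken from the study.

```python
def average_error_ratio(estimates, truths):
    """Mean relative error, |estimate - truth| / |truth| (assumed definition)."""
    ratios = [abs(e - t) / abs(t) for e, t in zip(estimates, truths)]
    return sum(ratios) / len(ratios)


def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical mixture proportions: ground truth vs. model estimates.
true_proportions = [0.20, 0.35, 0.50]
estimated_proportions = [0.21, 0.33, 0.52]
error_ratio = average_error_ratio(estimated_proportions, true_proportions)

# Hypothetical biomarker scores for HCC cases and cirrhotic controls.
hcc_scores = [0.9, 0.8, 0.4]
cirrhosis_scores = [0.5, 0.3, 0.2]
auc = roc_auc(hcc_scores, cirrhosis_scores)

print(f"average error ratio: {error_ratio:.4f}")
print(f"AUC: {auc:.4f}")
```

The pairwise AUC is quadratic in the number of samples, which is fine for small cohorts; a rank-based formulation would be the usual choice for larger ones.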

CONCLUSIONS

We investigated topic model-based inference methods to computationally address the heterogeneity issue in samples analyzed by LC/GC-MS. We observed that incorporating scan-level features has the potential to yield more accurate purification results by alleviating the information loss caused by integrating peaks. We believe cancer biomarker discovery studies that use mass spectrometric analysis of human biospecimens can greatly benefit from topic model-based purification of the data prior to statistical and pathway analyses.
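A toy example can make the information loss from peak integration concrete. The sketch below assumes evenly spaced scans and uses trapezoidal integration, which is one common way to obtain an integrated peak intensity; the two chromatograms are invented. They differ in shape but collapse to the same area, so an intensity-level model cannot tell them apart while a scan-level model still can.

```python
def integrate_peak(scan_intensities, scan_spacing=1.0):
    """Trapezoidal area under an extracted ion chromatogram (EIC)."""
    area = 0.0
    for left, right in zip(scan_intensities, scan_intensities[1:]):
        area += 0.5 * (left + right) * scan_spacing
    return area


# Hypothetical EICs: one symmetric peak and one tailing peak.
symmetric_eic = [0.0, 2.0, 6.0, 2.0, 0.0]
tailing_eic = [0.0, 6.0, 2.0, 1.0, 1.0, 0.0]

symmetric_area = integrate_peak(symmetric_eic)
tailing_area = integrate_peak(tailing_eic)

# Same integrated intensity, different scan-level profiles.
print(symmetric_area == tailing_area)
print(symmetric_eic == tailing_eic)
```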


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/f3744a9db27c/12864_2016_2796_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/7f89452c0254/12864_2016_2796_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/fce503da43ae/12864_2016_2796_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/0b285d36b270/12864_2016_2796_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/cf1cef205188/12864_2016_2796_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/154f07bc3f2c/12864_2016_2796_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/2bea94324af8/12864_2016_2796_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/06ff8b587075/12864_2016_2796_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/4c67d86fea1c/12864_2016_2796_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/c329fd525d7b/12864_2016_2796_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db46/5001243/435b422a53aa/12864_2016_2796_Fig11_HTML.jpg
