Adaptively capturing the heterogeneity of expression for cancer biomarker identification.

Affiliations

School of Mathematics and Physics, Anhui Jianzhu University, Hefei, 230022, Anhui, China.

Institute of Intelligent Machines, Hefei Institutes of Physical Science, CAS, 350 Shushanhu Road, P.O. Box 1130, Hefei, 230031, Anhui, China.

Publication information

BMC Bioinformatics. 2018 Nov 3;19(1):401. doi: 10.1186/s12859-018-2437-2.

DOI: 10.1186/s12859-018-2437-2

PMID: 30390627

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6215657/

Abstract

BACKGROUND

Identifying cancer biomarkers from transcriptomics data is important to cancer research. However, transcriptomics data are often complex and heterogeneous, which complicates biomarker identification in practice. Heterogeneity remains a challenge for detecting subtle but consistent changes in gene expression in cancer cells.

RESULTS

In this paper, we propose to adaptively capture the heterogeneity of expression across samples in a gene regulation space rather than in a gene expression space. Specifically, we transform gene expression profiles into gene regulation profiles and mathematically formulate statistics based on gene regulation probabilities (GRPs) to characterize the differential expression of genes between tumor and normal tissues. We then devise an unbiased estimator of GRPs (aGRP) that can interrogate and adaptively capture the heterogeneity of gene expression, and we derive an asymptotic significance analysis procedure for the new statistic. Because no parameter needs to be preset, aGRP is easy to use for researchers without a computer programming background. We evaluated the proposed method on both simulated and real-world data and compared it with previous methods. The experimental results demonstrated the superior performance of the proposed method in exploring expression heterogeneity to capture subtle but consistent alterations of gene expression in cancer.
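The abstract does not give the aGRP formulas, so the following is only a rough illustration of the general idea of scoring a gene by regulation probabilities, not the authors' estimator. It assumes a simple pairwise definition: the up-regulation probability is the chance that a tumor sample exceeds a normal sample, the down-regulation probability is the reverse, and the score is their difference. The function name `regulation_score` and the expression values are hypothetical.

```python
from itertools import product
from typing import Sequence


def regulation_score(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """Estimate P(tumor > normal) - P(tumor < normal) for one gene.

    Assumption (not taken from the paper): every tumor sample is compared
    with every normal sample, and ties count toward neither direction.
    The result lies in [-1, 1]; positive values suggest up-regulation.
    """
    if not tumor or not normal:
        raise ValueError("both groups need at least one sample")
    up = down = 0
    for t, n in product(tumor, normal):
        if t > n:
            up += 1
        elif t < n:
            down += 1
    return (up - down) / (len(tumor) * len(normal))


# Hypothetical expression values for a single gene (illustrative only).
tumor_expr = [5.1, 6.3, 4.8, 7.0]
normal_expr = [4.0, 4.5, 5.0]
score = regulation_score(tumor_expr, normal_expr)
```

With these values, 11 of the 12 tumor/normal pairs point up and 1 points down, so the score is 10/12. Because the pairwise proportion is an average over all sample pairs, it needs no preset threshold, which is the property the abstract emphasizes; how aGRP itself achieves this is described in the full paper and source code.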

CONCLUSIONS

Expression heterogeneity strongly influences the performance of cancer biomarker identification from transcriptomics data, so models that deal with it efficiently are needed. The proposed method can serve as a standalone tool owing to its capacity to adaptively capture sample heterogeneity and its simplicity of use.

SOFTWARE AVAILABILITY

The source code of aGRP can be downloaded from https://github.com/hqwang126/aGRP.
