Learning rule sets from survival data.

Author information

Wróbel Łukasz, Gudyś Adam, Sikora Marek

Affiliations

Institute of Informatics, Silesian Univ. of Technology, Akademicka 16, Gliwice, 44-100, Poland.

Institute of Innovative Technologies, EMAG, Leopolda 31, Katowice, 40-189, Poland.

Publication information

BMC Bioinformatics. 2017 May 30;18(1):285. doi: 10.1186/s12859-017-1693-x.

DOI:10.1186/s12859-017-1693-x

PMID:28558674

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5450332/

Abstract

BACKGROUND

Survival analysis is an important element of reasoning from data. Applied in a number of fields, it has become particularly useful in medicine for estimating the survival rate of patients on the basis of their condition, examination results, and ongoing treatment. Recent developments in next-generation sequencing open new opportunities in survival studies, as they allow a vast number of genome-, transcriptome-, and proteome-related features to be investigated. These include single nucleotide and structural variants, expression of genes and microRNAs, DNA methylation, and many others.

RESULTS

We present LR-Rules, a new algorithm for rule induction from survival data. It follows the separate-and-conquer heuristic and uses the log-rank test to establish the rule body. Extensive experiments show that LR-Rules generates models of superior accuracy and comprehensibility. A detailed analysis of the rules produced by the algorithm on four medical datasets, concerning leukemia as well as breast, lung, and thyroid cancers, reveals its ability to discover true relations between attributes and patients' survival rate. Two of the case studies incorporate features obtained with high-throughput technologies, showing the usability of the algorithm in the analysis of bioinformatics data.

CONCLUSIONS

LR-Rules is a viable alternative to existing approaches to survival analysis, particularly when the interpretability of the resulting model is crucial. The presented algorithm may be especially useful when applied to genomic and proteomic data, as it may contribute to a better understanding of the background of diseases and support their treatment.
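The abstract states only that LR-Rules uses the log-rank test to establish the rule body. As a rough illustration of that idea, and not the authors' implementation, the sketch below computes the standard two-group log-rank statistic and uses it to score candidate threshold conditions by comparing the examples a condition covers with those it does not. The function names, the row layout (`time`, `event`, plus numeric attributes), and the covered-versus-uncovered comparison are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Obs = Tuple[float, bool]  # (survival time, True if the event was observed, False if censored)


def log_rank_statistic(group_a: List[Obs], group_b: List[Obs]) -> float:
    """Standard two-group log-rank chi-square statistic (1 degree of freedom)."""
    event_times = sorted({t for t, observed in group_a + group_b if observed})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers still at risk just before time t in each group.
        n_a = sum(1 for time, _ in group_a if time >= t)
        n_b = sum(1 for time, _ in group_b if time >= t)
        # Numbers of events occurring exactly at time t.
        d_a = sum(1 for time, observed in group_a if observed and time == t)
        d_b = sum(1 for time, observed in group_b if observed and time == t)
        n, d = n_a + n_b, d_a + d_b
        if n < 2:
            continue  # the variance term is undefined with one subject at risk
        observed_minus_expected += d_a - d * n_a / n
        variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_split(rows: List[Dict[str, float]], attributes: List[str]) -> Tuple[float, str, float]:
    """Hypothetical helper: score every midpoint threshold of every attribute
    and return (statistic, attribute, threshold) for the highest-scoring one."""
    best = (0.0, "", 0.0)
    for attr in attributes:
        values = sorted({row[attr] for row in rows})
        for low, high in zip(values, values[1:]):
            cut = (low + high) / 2
            covered = [(r["time"], bool(r["event"])) for r in rows if r[attr] <= cut]
            uncovered = [(r["time"], bool(r["event"])) for r in rows if r[attr] > cut]
            best = max(best, (log_rank_statistic(covered, uncovered), attr, cut))
    return best


# Made-up toy data: older patients die early, "noise" is unrelated to survival.
rows = [
    {"age": 30, "noise": 1, "time": 12, "event": 0},
    {"age": 35, "noise": 0, "time": 11, "event": 1},
    {"age": 40, "noise": 1, "time": 10, "event": 1},
    {"age": 60, "noise": 0, "time": 3, "event": 1},
    {"age": 65, "noise": 1, "time": 2, "event": 1},
    {"age": 70, "noise": 0, "time": 1, "event": 1},
]
statistic, attribute, threshold = best_split(rows, ["age", "noise"])
print(attribute, threshold, round(statistic, 2))
```

In a separate-and-conquer loop, a step like this would presumably be repeated to add conditions to a rule, after which the covered examples are set aside and the next rule is grown on the remainder. Those details, along with pruning and stopping criteria, are not given in the abstract and are left out here.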
