Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure.

Authors

Lustgarten Jonathan Lyle, Balasubramanian Jeya Balaji, Visweswaran Shyam, Gopalakrishnan Vanathi

Affiliations

Red Bank Veterinary Hospital / 2051 Briggs Rd, Mt Laurel, NJ 08054, USA.

Intelligent Systems Program, University of Pittsburgh / 5113 Sennott Square, 210 South Bouquet Street, Pittsburgh, PA 15260, USA.

Publication information

Data (Basel). 2017 Mar;2(1). doi: 10.3390/data2010005. Epub 2017 Jan 18.

DOI:10.3390/data2010005
PMID:28331847
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5358670/
Abstract

The comprehensibility of good predictive models learned from high-dimensional gene expression data is attractive because it can lead to biomarker discovery. Several good classifiers provide comparable predictive performance but differ in their ability to summarize the observed data. We extend a Bayesian Rule Learning algorithm (BRL-GSS), previously shown to be a significantly better predictor than other classical approaches in this domain. It searches a space of Bayesian networks using a decision tree representation of its parameters with global constraints, and infers a set of IF-THEN rules. The number of parameters, and therefore the number of rules, is combinatorial in the number of predictor variables in the model. We relax these global constraints to a more generalizable local structure (BRL-LSS). BRL-LSS yields a more parsimonious set of rules because it does not have to generate all combinatorial rules. The search space of local structures is much richer than the space of global structures. We design BRL-LSS to have the same worst-case time complexity as BRL-GSS while exploring a richer and more complex model space. We measure predictive performance using the area under the ROC curve (AUC) and accuracy. We measure model parsimony by the average number of rules and variables needed to describe the observed data. We evaluate the predictive performance and parsimony of BRL-GSS, BRL-LSS, and the state-of-the-art C4.5 decision tree algorithm, using 10-fold cross-validation on ten microarray gene-expression diagnostic datasets. In these experiments, we observe that BRL-LSS is similar to BRL-GSS in predictive performance while generating a much more parsimonious set of rules to explain the same observed data. BRL-LSS also needs fewer variables than C4.5 to explain the data with similar predictive performance. We also conduct a feasibility study to demonstrate the general applicability of our BRL methods to the newer RNA sequencing gene-expression data.
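The difference in rule counts between a global and a local structure can be illustrated with a small sketch. This is not the BRL-GSS or BRL-LSS search itself; it only counts the IF-THEN rules implied by two hand-written representations. The gene names, the discretized values, and the helper functions `global_rules` and `local_rules` are hypothetical and chosen for illustration.

```python
from itertools import product


def global_rules(variables):
    """Global structure: one rule per joint assignment of all predictors."""
    names = list(variables)
    value_lists = [variables[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


def local_rules(tree, path=None):
    """Local structure: one rule per leaf of a decision tree.

    An internal node is a dict with keys "var" and "children";
    any other value is treated as a leaf (a class label).
    """
    path = path or {}
    if not isinstance(tree, dict):
        return [dict(path)]
    rules = []
    for value, child in tree["children"].items():
        rules.extend(local_rules(child, {**path, tree["var"]: value}))
    return rules


# Hypothetical discretized predictors (assumption, not from the paper).
variables = {
    "geneA": ["low", "high"],
    "geneB": ["low", "high"],
    "geneC": ["low", "high"],
}

# A hand-written tree that stops splitting once a branch is decided.
tree = {
    "var": "geneA",
    "children": {
        "low": "healthy",
        "high": {
            "var": "geneB",
            "children": {
                "low": "healthy",
                "high": {
                    "var": "geneC",
                    "children": {"low": "healthy", "high": "disease"},
                },
            },
        },
    },
}

print(len(global_rules(variables)))  # 8 rules: 2 * 2 * 2 joint assignments
print(len(local_rules(tree)))        # 4 rules: one per leaf
```

With three binary predictors, the global representation enumerates every combination of values, whereas the tree emits a rule only for each leaf, so a branch such as `geneA = low` needs no further conditions.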

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a002/5358670/7a67f915ca4b/nihms846819f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a002/5358670/c2ec35adcf9f/nihms846819f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a002/5358670/0233add59868/nihms846819f2.jpg