An Efficient Mixed-Model for Screening Differentially Expressed Genes of Breast Cancer Based on LR-RF.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2019 Jan-Feb;16(1):124-130. doi: 10.1109/TCBB.2018.2829519. Epub 2018 Apr 23.

DOI: 10.1109/TCBB.2018.2829519
PMID: 29993693
Abstract

To screen differentially expressed genes in breast cancer quickly and efficiently, two breast cancer gene microarray datasets, GSE15852 and GSE45255, were downloaded from GEO. By combining the Logistic Regression and Random Forest algorithms, this paper proposes a novel method named LR-RF to select differentially expressed genes of breast cancer from microarray data, using the Bonferroni test to control the family-wise error rate (FWER). Compared with Logistic Regression and Random Forest, our study shows that LR-RF is highly effective at selecting differentially expressed genes. The average prediction accuracy of the proposed LR-RF over 10 repeated random tests reaches 93.11 percent, with a variance as low as 0.00045. The prediction accuracy reaches a maximum of 95.57 percent when the threshold value α = 0.2 is used in the random forest step that ranks genes by importance score, and the number of differentially expressed genes selected is relatively small. In addition, analysis of the gene interaction networks showed that most of the top 20 selected genes are involved in the development of breast cancer. All of these results demonstrate the reliability and efficiency of LR-RF. It is anticipated that LR-RF will provide new knowledge and methods for biologists, medical scientists, and cognitive computing researchers to identify disease-related genes of breast cancer.
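
The abstract names a Bonferroni test under the FWER error measure but does not give its details. As a minimal sketch of the standard Bonferroni rule only (not the authors' implementation), the example below keeps a gene when its p-value is at most the FWER level divided by the number of genes tested. The function name `bonferroni_select`, the gene labels, the p-values, and the FWER level of 0.05 are all hypothetical; the α = 0.2 threshold mentioned in the abstract belongs to the random forest importance ranking and is not modelled here.

```python
def bonferroni_select(p_values, fwer=0.05):
    """Return the genes whose p-value passes the Bonferroni cutoff.

    p_values: dict mapping gene name -> p-value from a per-gene test
              (hypothetical input; the abstract does not specify the test).
    fwer:     target family-wise error rate (0.05 is an assumed default).
    """
    m = len(p_values)
    if m == 0:
        return []
    # Bonferroni: compare each p-value with fwer / m, where m is the
    # number of hypotheses (genes) tested.
    cutoff = fwer / m
    return sorted(gene for gene, p in p_values.items() if p <= cutoff)


# Hypothetical p-values for four made-up genes.
example_p = {"GENE_A": 0.0001, "GENE_B": 0.02, "GENE_C": 0.004, "GENE_D": 0.6}
selected = bonferroni_select(example_p, fwer=0.05)
print(selected)
```

With four genes the cutoff is 0.05 / 4 = 0.0125, so only the two genes with p-values below that value are kept. In a pipeline like the one the abstract outlines, a list of this kind could then be passed to a random forest for importance ranking, though how the two steps are actually combined is not stated here.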

Similar articles

1
An Efficient Mixed-Model for Screening Differentially Expressed Genes of Breast Cancer Based on LR-RF.
IEEE/ACM Trans Comput Biol Bioinform. 2019 Jan-Feb;16(1):124-130. doi: 10.1109/TCBB.2018.2829519. Epub 2018 Apr 23.
2
Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees.
BMC Bioinformatics. 2013 Mar 19;14:100. doi: 10.1186/1471-2105-14-100.
3
High-efficient Screening Method for Identification of Key Genes in Breast Cancer Through Microarray and Bioinformatics.
Anticancer Res. 2017 Aug;37(8):4329-4335. doi: 10.21873/anticanres.11826.
4
Analysis of the microarray gene expression for breast cancer progression after the application modified logistic regression.
Gene. 2020 Feb 5;726:144168. doi: 10.1016/j.gene.2019.144168. Epub 2019 Nov 21.
5
GSEA-SDBE: A gene selection method for breast cancer classification based on GSEA and analyzing differences in performance metrics.
PLoS One. 2022 Apr 26;17(4):e0263171. doi: 10.1371/journal.pone.0263171. eCollection 2022.
6
[Differentially expressed genes and potential signaling pathway in Asian people with breast cancer by preliminary analysis of a large sample of the microarray data].
Nan Fang Yi Ke Da Xue Xue Bao. 2014 Jun;34(6):807-12.
7
Construction of breast cancer gene regulatory networks and drug target optimization.
Arch Gynecol Obstet. 2014 Oct;290(4):749-55. doi: 10.1007/s00404-014-3264-y. Epub 2014 Jun 3.
8
A multicenter random forest model for effective prognosis prediction in collaborative clinical research network.
Artif Intell Med. 2020 Mar;103:101814. doi: 10.1016/j.artmed.2020.101814. Epub 2020 Feb 5.
9
Identifying differentially expressed genes in cancer patients using a non-parameter Ising model.
Proteomics. 2011 Oct;11(19):3845-52. doi: 10.1002/pmic.201100180. Epub 2011 Aug 23.
10
[Identification of the differentially expressed genes between primary breast cancer and paired lymph node metastasis through combining mRNA differential display and gene microarray].
Zhonghua Yi Xue Za Zhi. 2006 Oct 24;86(39):2749-55.

Cited by

1
Algorithm for analyzing randomness in point patterns.
MethodsX. 2025 May 8;14:103360. doi: 10.1016/j.mex.2025.103360. eCollection 2025 Jun.
2
Accurate breast cancer diagnosis using a stable feature ranking algorithm.
BMC Med Inform Decis Mak. 2023 Apr 6;23(1):64. doi: 10.1186/s12911-023-02142-2.
3
Extracting predictors for lung adenocarcinoma based on Granger causality test and stepwise character selection.
BMC Bioinformatics. 2019 May 1;20(Suppl 7):197. doi: 10.1186/s12859-019-2739-z.