Prediction of drought-resistant genes in Arabidopsis thaliana using SVM-RFE.

Affiliation

Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China.

Publication information

PLoS One. 2011;6(7):e21750. doi: 10.1371/journal.pone.0021750. Epub 2011 Jul 15.

DOI:10.1371/journal.pone.0021750
PMID:21789178
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3137602/
Abstract

BACKGROUND

Identifying genes with essential roles in resisting environmental stress is of high agronomic importance. Although massive amounts of DNA microarray gene expression data have been generated for plants, current computational approaches underutilize these data for studying genotype-trait relationships. Some advanced gene identification methods have been explored for human diseases, but these methods have typically not been converted into publicly available software tools and cannot be applied to plants to identify genes underlying agronomic traits.

METHODOLOGY

In this study, we used 22 sets of Arabidopsis thaliana gene expression data from GEO to predict the key genes involved in water tolerance. We applied an SVM-RFE (Support Vector Machine-Recursive Feature Elimination) feature selection method for the prediction. To address small sample sizes, we developed a modified approach for SVM-RFE by using bootstrapping and leave-one-out cross-validation. We also expanded our study to predict genes involved in water susceptibility.
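
The bootstrapped SVM-RFE idea described above can be sketched roughly as follows. This is an illustration only, not the authors' RFET software: the abstract does not say how bootstrapping and leave-one-out cross-validation are combined, so summing ranks across bootstrap resamples is an assumption, as are all function names, parameters, and the synthetic data. A small pure-Python Pegasos-style solver stands in for whatever SVM implementation the study used.

```python
import random

def train_linear_svm(X, y, lam=0.1, epochs=50, seed=0):
    """Linear SVM (hinge loss + L2 penalty) fitted by Pegasos-style
    stochastic subgradient descent. No bias term; labels must be +1/-1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (L2 penalty)
            if margin < 1.0:  # margin violated: take a hinge-loss step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def svm_rfe(X, y, seed=0):
    """Recursive feature elimination: repeatedly drop the feature with the
    smallest squared weight. Returns feature indices, most important first."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y, seed=seed)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]

def bootstrap_rfe(X, y, n_boot=10, seed=0):
    """Run SVM-RFE on bootstrap resamples and order features by summed rank
    (hypothetical aggregation rule, chosen for illustration)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    rank_sum = [0] * d
    done = 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        if len({y[i] for i in idx}) < 2:
            continue  # a resample needs both classes to train an SVM
        ranking = svm_rfe([X[i] for i in idx], [y[i] for i in idx], seed=done)
        for rank, feat in enumerate(ranking):
            rank_sum[feat] += rank
        done += 1
    return sorted(range(d), key=lambda j: rank_sum[j])

def loocv_accuracy(X, y, features):
    """Leave-one-out accuracy of a linear SVM restricted to `features`."""
    hits = 0
    for i in range(len(X)):
        X_tr = [[row[j] for j in features] for k, row in enumerate(X) if k != i]
        y_tr = [label for k, label in enumerate(y) if k != i]
        w = train_linear_svm(X_tr, y_tr)
        score = sum(wj * X[i][j] for wj, j in zip(w, features))
        hits += (1 if score > 0 else -1) == y[i]
    return hits / len(X)

# Synthetic stand-in for expression data: 20 samples, 5 "genes".
# Gene 0 tracks the class label; genes 1-4 are pure noise.
data_rng = random.Random(42)
y = [1 if i % 2 == 0 else -1 for i in range(20)]
X = [[3 * label + data_rng.gauss(0, 0.3)] + [data_rng.gauss(0, 1) for _ in range(4)]
     for label in y]

ranking = bootstrap_rfe(X, y)
print("genes ranked by summed bootstrap rank:", ranking)
print("LOOCV accuracy using the top gene:", loocv_accuracy(X, y, ranking[:1]))
```

Aggregating over resamples is meant to make the ranking less sensitive to any single small sample set, which is the motivation the abstract gives for the modification. With real microarray data one would normally remove features in batches rather than one at a time to keep the run time manageable.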

CONCLUSIONS

We analyzed the top 10 genes predicted to be involved in water tolerance. Seven of them are connected to known biological processes in drought resistance. We also analyzed the top 100 genes in terms of their biological functions. Our study shows that SVM-RFE is a highly promising method for analyzing plant microarray data to study genotype-phenotype relationships. The software is freely available with source code at http://ccst.jlu.edu.cn/JCSB/RFET/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56ae/3137602/74df3a10e5a4/pone.0021750.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56ae/3137602/1e2277fa9da0/pone.0021750.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56ae/3137602/e48f3788bf7c/pone.0021750.g003.jpg
