

Identification and optimization of classifier genes from multi-class earthworm microarray dataset.

Affiliation

School of Computing, University of Southern Mississippi, Hattiesburg, Mississippi, United States of America.

Publication information

PLoS One. 2010 Oct 28;5(10):e13715. doi: 10.1371/journal.pone.0013715.

DOI:10.1371/journal.pone.0013715
PMID:21060837
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2965664/
Abstract

Monitoring, assessment and prediction of the environmental risks that chemicals pose demand rapid and accurate diagnostic assays. A variety of toxicological effects have been associated with the explosive compounds TNT and RDX. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. We have developed an earthworm microarray containing 15,208 unique oligo probes and have used it to profile gene expression in 248 earthworms exposed to TNT, RDX or neither. We assembled a new machine learning pipeline consisting of several well-established feature filtering/selection and classification techniques to analyze the 248-array dataset in order to construct classifier models that can separate earthworm samples into three groups: control, TNT-treated, and RDX-treated. First, a total of 869 genes differentially expressed in response to TNT or RDX exposure were identified using a univariate statistical algorithm of class comparison. Then, decision tree-based algorithms were applied to select a subset of 354 classifier genes, which were ranked by their overall weight of significance. A multiclass support vector machine (MC-SVM) method and an unsupervised K-means clustering method were applied to independently refine the classifier, producing smaller subsets of 39 and 30 classifier genes, respectively, with the 11 genes common to both being potential biomarkers. The combined 58 genes were considered the refined subset and used to build MC-SVM and clustering models with classification accuracies of 83.5% and 56.9%, respectively. This study demonstrates that the machine learning approach can be used to identify and optimize a small subset of classifier/biomarker genes from high-dimensional datasets and to generate classification models of acceptable precision for multiple classes.
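The gene counts in the abstract can be checked with a small sketch. It is an illustration only, not the authors' pipeline: a one-way ANOVA F statistic stands in for the univariate class-comparison step (the abstract does not name the exact test), and the gene identifiers, expression values and threshold are hypothetical. The last part shows why subsets of 39 and 30 genes with 11 in common combine to 58.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for one gene across class groups."""
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def filter_genes(expression, labels, threshold):
    """Keep genes whose F statistic across the classes exceeds threshold."""
    classes = sorted(set(labels))
    kept = []
    for gene, values in expression.items():
        groups = [
            [v for v, lab in zip(values, labels) if lab == c]
            for c in classes
        ]
        if f_statistic(groups) > threshold:
            kept.append(gene)
    return kept


def combine_subsets(svm_genes, kmeans_genes):
    """Return (genes common to both subsets, union of both subsets)."""
    return svm_genes & kmeans_genes, svm_genes | kmeans_genes


# Hypothetical toy data: two arrays per class, two genes.
labels = ["control", "control", "TNT", "TNT", "RDX", "RDX"]
expression = {
    "gene_a": [1.0, 1.1, 3.0, 3.1, 5.0, 5.1],  # differs between classes
    "gene_b": [2.0, 2.1, 2.0, 2.1, 2.0, 2.1],  # same in every class
}
selected = filter_genes(expression, labels, threshold=10.0)

# Hypothetical gene IDs arranged to match the reported counts:
# 39 genes from MC-SVM, 30 from K-means, 11 shared.
svm_genes = {f"g{i}" for i in range(39)}
kmeans_genes = {f"g{i}" for i in range(28, 58)}
common, combined = combine_subsets(svm_genes, kmeans_genes)

print(selected)
print(len(common), len(combined))
```

The combined size follows from inclusion-exclusion: 39 + 30 - 11 = 58, which matches the refined subset reported in the abstract.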


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec53/2965664/bfdcd56b0aa0/pone.0013715.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec53/2965664/7c1c75fd3ca7/pone.0013715.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec53/2965664/9c1a76311ca6/pone.0013715.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec53/2965664/6c66e0cbbf7a/pone.0013715.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec53/2965664/99b8d371bcd2/pone.0013715.g005.jpg


