Comparative study of classification algorithms for immunosignaturing data.

Affiliation

Center for Innovations in Medicine, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA.

Publication information

BMC Bioinformatics. 2012 Jun 21;13:139. doi: 10.1186/1471-2105-13-139.

DOI:10.1186/1471-2105-13-139
PMID:22720696
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3430557/
Abstract

BACKGROUND

High-throughput technologies such as DNA, RNA, protein, antibody and peptide microarrays are often used to examine differences across drug treatments, diseases, transgenic animals, and other conditions. Typically one trains a classification system by gathering large amounts of probe-level data, selecting informative features, and then classifying test samples using a small number of features. As new microarrays are invented, classification systems that worked well for other array types may not be ideal. Expression microarrays, arguably one of the most prevalent array types, have been used for years to help develop classification algorithms. Many biological assumptions are built into classifiers that were designed for these types of data. One of the more problematic is the assumption of independence, both at the probe level and again at the biological level. Probes for RNA transcripts are designed to bind single transcripts. At the biological level, many genes have dependencies across transcriptional pathways, where co-regulation of transcriptional units may make many genes appear completely dependent. Thus, algorithms that perform well for gene expression data may not be suitable for other technologies with different binding characteristics. The immunosignaturing microarray is based on complex mixtures of antibodies binding to arrays of random-sequence peptides. It relies on many-to-many binding: each peptide can bind multiple antibodies and each antibody can bind multiple peptides. This technology has been shown to be highly reproducible and appears promising for diagnosing a variety of disease states. However, it is not clear which classification algorithm is optimal for analyzing this new type of data.
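The "select informative features, then classify with a small number of them" step described above can be sketched as follows. This is a minimal illustration only: ranking probes by the absolute Welch t-statistic is an assumed selection criterion, the function names `welch_t` and `select_features` are hypothetical, and the toy intensities are invented; none of these come from the abstract.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two lists of intensities (each needs >= 2 values)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(X, y, k):
    """Return the indices of the k probes with the largest |t| between classes 0 and 1."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(a, b)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Invented toy data: 6 samples x 4 probes; only probe 2 separates the classes.
X = [
    [1.0, 5.1, 0.2, 3.0],
    [1.2, 4.9, 0.1, 3.1],
    [0.9, 5.0, 0.3, 2.9],
    [1.1, 5.2, 2.1, 3.0],
    [1.0, 4.8, 2.0, 3.2],
    [1.3, 5.0, 2.2, 2.8],
]
y = [0, 0, 0, 1, 1, 1]
top = select_features(X, y, k=1)
```

A univariate ranking like this scores each probe in isolation, which is exactly where the independence assumption discussed above enters the pipeline.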

RESULTS

We characterized several classification algorithms to analyze immunosignaturing data. We selected several datasets that range from easy to difficult to classify, from simple monoclonal binding to complex binding patterns in asthma patients. We then classified the biological samples using 17 different classification algorithms. Using a wide variety of assessment criteria, we found 'Naïve Bayes' far more useful than other widely used methods due to its simplicity, robustness, speed and accuracy.
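To make the classifier named above concrete, here is a minimal Gaussian Naive Bayes sketch with a leave-one-out accuracy estimate. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian likelihood, the variance floor, the leave-one-out protocol, the class and function names, and the toy data are all hypothetical choices that the abstract does not specify.

```python
from math import log, pi
from statistics import mean, pvariance


class GaussianNaiveBayes:
    """Per-class, per-feature Gaussian model with independent features."""

    def __init__(self, var_floor=1e-6):
        self.var_floor = var_floor  # assumed guard against zero variance
        self.params = {}

    def fit(self, X, y):
        n = len(y)
        self.params = {}
        for c in sorted(set(y)):
            rows = [row for row, label in zip(X, y) if label == c]
            cols = list(zip(*rows))  # transpose: one tuple per feature
            self.params[c] = {
                "log_prior": log(len(rows) / n),
                "means": [mean(col) for col in cols],
                "vars": [pvariance(col) + self.var_floor for col in cols],
            }
        return self

    def log_posterior(self, x, c):
        """Unnormalized log posterior: log prior plus summed log likelihoods."""
        p = self.params[c]
        total = p["log_prior"]
        for value, mu, var in zip(x, p["means"], p["vars"]):
            total += -0.5 * log(2 * pi * var) - (value - mu) ** 2 / (2 * var)
        return total

    def predict(self, x):
        return max(self.params, key=lambda c: self.log_posterior(x, c))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    hits = 0
    for i in range(len(X)):
        model = GaussianNaiveBayes().fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += model.predict(X[i]) == y[i]
    return hits / len(X)


# Invented toy data: two well-separated groups measured on two peptides.
X = [[1.0, 2.0], [1.2, 1.9], [0.8, 2.1], [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
y = ["control"] * 3 + ["disease"] * 3
accuracy = leave_one_out_accuracy(X, y)
model = GaussianNaiveBayes().fit(X, y)
```

Training reduces to computing a mean and a variance per peptide per class, which is consistent with the simplicity and speed the abstract attributes to the method.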

CONCLUSIONS

The 'Naïve Bayes' algorithm appears to accommodate the complex patterns hidden within multilayered immunosignaturing microarray data due to its fundamental mathematical properties.
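The abstract does not state which mathematical properties it has in mind. For reference, the standard textbook Naive Bayes decision rule is given below, where $x_j$ is the intensity of peptide $j$, $p$ is the number of selected peptides, and $c$ is a class label. The Gaussian form of the per-feature likelihood, with class-specific mean $\mu_{jc}$ and variance $\sigma_{jc}^2$, is an assumption made here for illustration.

```latex
\hat{c}(x) = \arg\max_{c} \left[ \log P(c) + \sum_{j=1}^{p} \log P(x_j \mid c) \right],
\qquad
P(x_j \mid c) = \frac{1}{\sqrt{2\pi\sigma_{jc}^{2}}}
\exp\!\left( -\frac{(x_j - \mu_{jc})^{2}}{2\sigma_{jc}^{2}} \right)
```

Because the log posterior is a sum of per-feature terms, each peptide contributes independently to the score, and the number of parameters grows linearly with the number of peptides.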
